A Streaming Feature Selection Method Based on Dynamic Feature Clustering and Particle Swarm Optimization

IEEE Transactions on Evolutionary Computation · 2024
Cited by 26
ABS 4

Summary

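The paper proposes a three-stage streaming feature selection method: it first removes irrelevant features online, then groups redundant features through dynamic clustering, and finally searches for the optimal feature subset with integer particle swarm optimization. Experiments show better classification performance within a reasonable running time.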

Abstract

Feature selection (FS) is an effective data preprocessing technique. In some practical applications, features arrive continuously, one by one or in groups, and the exact number of features cannot be known before learning. Streaming FS (SFS) aims to remove redundant and irrelevant features from the continuously arriving features. This article proposes a three-stage SFS method based on dynamic feature clustering and particle swarm optimization (SFS-DPSO). In the first stage, an online relevance analysis is used to quickly remove irrelevant features, reducing the size of newly arrived feature groups. In the second stage, a dynamic feature clustering technique is employed to divide redundant features into different groups, thereby reducing the search space for the subsequent evolutionary algorithm. In the third stage, a historical information-driven integer particle swarm optimization algorithm is used to search for an optimal feature subset in the clustered feature space. The proposed algorithm is applied to 12 typical datasets with different difficulty levels and to a real-world case. Experimental results show that it achieves better classification results in a reasonable time and is superior to most existing algorithms.
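
The three stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' SFS-DPSO implementation. Everything below is an assumption made for illustration: absolute Pearson correlation stands in for the relevance and redundancy measures, the thresholds `rel_thr` and `red_thr` are arbitrary, mutually redundant features are assumed to share a cluster, the clustering is a simple greedy assignment to a cluster representative, the fitness is a toy score instead of classifier accuracy, and the PSO is a plain rounded-velocity variant that does not model the historical information mechanism of the paper.

```python
import math
import random

def pearson(x, y):
    """Absolute Pearson correlation of two equal-length sequences (0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)

def filter_relevant(group, y, rel_thr=0.3):
    """Stage 1 (assumed measure): keep arriving features whose relevance exceeds rel_thr."""
    return {name: col for name, col in group.items() if pearson(col, y) > rel_thr}

def update_clusters(clusters, data, new_feats, red_thr=0.8):
    """Stage 2 (assumed rule): add each new feature to the first cluster whose
    representative (first member) it is redundant with, else open a new cluster."""
    for name, col in new_feats.items():
        data[name] = col
        for members in clusters:
            if pearson(col, data[members[0]]) >= red_thr:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def decode(position, clusters):
    """Integer encoding: position[i] == 0 skips cluster i, k > 0 picks its k-th member."""
    return [c[k - 1] for k, c in zip(position, clusters) if k > 0]

def integer_pso(clusters, fitness, n_particles=10, n_iter=30, seed=0):
    """Stage 3 (simplified): integer PSO with one decision variable per cluster."""
    rng = random.Random(seed)
    upper = [len(c) for c in clusters]
    swarm = [[rng.randint(0, u) for u in upper] for _ in range(n_particles)]
    vel = [[0.0] * len(upper) for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(decode(p, clusters)) for p in swarm]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i, p in enumerate(swarm):
            for d, u in enumerate(upper):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - p[d])
                             + 1.5 * rng.random() * (gbest[d] - p[d]))
                # Round to the nearest integer and clamp to the valid range [0, u].
                p[d] = min(u, max(0, round(p[d] + vel[i][d])))
            f = fitness(decode(p, clusters))
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = p[:], f
                if f > gbest_fit:
                    gbest, gbest_fit = p[:], f
    return decode(gbest, clusters), gbest_fit

# Hypothetical toy stream: two feature groups arriving one after the other.
y = [0, 0, 0, 0, 1, 1, 1, 1]
group1 = {"f1": [0, 0, 0, 0, 1, 1, 1, 1],
          "f2": [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 1.1],
          "noise": [0, 1, 0, 1, 0, 1, 0, 1]}
group2 = {"f3": [0, 0, 1, 0, 1, 1, 0, 1]}

def toy_fitness(subset):
    # Stand-in for classifier accuracy: reward relevance, penalize subset size.
    return sum(pearson(data[f], y) for f in subset) - 0.1 * len(subset)

data, clusters = {}, []
for group in (group1, group2):
    kept = filter_relevant(group, y)        # stage 1 on the new group only
    update_clusters(clusters, data, kept)   # stage 2 updates existing clusters
selected, score = integer_pso(clusters, toy_fitness)  # stage 3
print(clusters, selected, score)
```

In this toy run, `noise` is dropped in stage 1, and `f2` joins the cluster of `f1` because the two are nearly collinear. The search space therefore shrinks from one variable per feature to one integer per cluster, and any decoded subset contains at most one feature from each cluster.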

Feature selection · Particle swarm optimization · Streaming data · Data preprocessing · Machine learning