Unsupervised Feature Selection With Weighted and Projected Adaptive Neighbors
针对高维数据中噪声特征导致图结构不可靠的问题,提出一种同时学习图结构、特征权重和投影矩阵的无监督特征选择模型,使图结构基于加权投影后的数据构建,并约束图恰好包含c个连通分量。
In the field of data mining, how to deal with high-dimensional data is a fundamental problem. If they are used directly, it is not only computationally expensive but also difficult to obtain satisfactory results. Unsupervised feature selection is designed to reduce the dimension of data by finding a subset of features in the absence of labels. Many unsupervised methods perform feature selection by exploring spectral analysis and manifold learning, such that the intrinsic structure of data can be preserved. However, most of these methods ignore a fact: due to the existence of noise features, the intrinsic structure directly built from original data may be unreliable. To solve this problem, a new unsupervised feature selection model is proposed. The graph structure, feature weights, and projection matrix are learned simultaneously, such that the intrinsic structure is constructed by the data that have been feature weighted and projected. For each data point, its nearest neighbors are acquired in the process of graph construction. Therefore, we call them adaptive neighbors. Besides, an additional constraint is added to the proposed model. It requires that a graph, corresponding to a similarity matrix, should contain exactly c connected components. Then, we present an optimization algorithm to solve the proposed model. Next, we discuss the method of determining the regularization parameter γ in our proposed method and analyze the computational complexity of the optimization algorithm. Finally, experiments are implemented on both synthetic and real-world datasets to demonstrate the effectiveness of the proposed method.