A hybrid clustering and boosting tree feature selection (CBTFS) method for credit risk assessment with high-dimensionality

Technological and Economic Development of Economy · 2025
Cited by: 2
Renmin University of China journal rating: A-

Summary

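The paper proposes a hybrid clustering and boosting tree feature selection method. An improved minimum spanning tree first removes redundant features, and Random Forest, XGBoost, and AdaBoost then rank the remaining features. Experiments on real-world credit datasets show that the method outperforms classic feature selection methods.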

Abstract

To address the high-dimensionality problem in credit risk assessment, a hybrid clustering and boosting tree feature selection (CBTFS) method is proposed. In this method, an improved minimum spanning tree model is first used to remove redundant and irrelevant features. Three embedded feature selection approaches (Random Forest, XGBoost, and AdaBoost) are then used to further improve feature-ranking efficiency and to obtain better prediction performance by applying the optimal features. For verification, two real-world credit datasets are used to demonstrate the effectiveness of the proposed CBTFS method. Experimental results show that the proposed method is superior to other classic feature selection methods. This indicates that CBTFS can be used as a promising tool for solving the high-dimensionality problem in credit risk assessment.

First published online 12 February 2025
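The first stage described in the abstract can be illustrated with a small sketch. The abstract does not specify how the minimum spanning tree is "improved", so the code below is only a generic MST-based feature clustering, not the paper's algorithm: it assumes a distance of 1 - |Pearson correlation| between features, cuts tree edges longer than a hypothetical threshold `cut`, and keeps one feature per cluster if its correlation with the label reaches a hypothetical `min_relevance`. The toy data and all function names are illustrative. The second stage (ranking with Random Forest, XGBoost, and AdaBoost) is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mst_feature_clusters(columns, cut=0.5):
    """Group features by building an MST and dropping edges longer than `cut`.

    columns: list of feature columns (each a list of numbers).
    cut: assumed distance threshold (hypothetical, not from the paper).
    """
    p = len(columns)
    # Complete graph over features; strongly correlated features are close.
    edges = sorted(
        (1.0 - abs(pearson(columns[i], columns[j])), i, j)
        for i in range(p)
        for j in range(i + 1, p)
    )
    parent = list(range(p))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Kruskal's algorithm, stopped at the threshold: the forest built so far
    # equals the full MST with every edge longer than `cut` removed.
    for dist, i, j in edges:
        if dist > cut:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j

    clusters = {}
    for i in range(p):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def select_representatives(columns, labels, clusters, min_relevance=0.1):
    """Keep the most label-correlated feature of each cluster, if relevant enough."""
    selected = []
    for cluster in clusters:
        relevance = {j: abs(pearson(columns[j], labels)) for j in cluster}
        best = max(cluster, key=relevance.get)
        if relevance[best] >= min_relevance:
            selected.append(best)
    return selected


# Toy data (illustrative only): feature 1 is a rescaled copy of feature 0,
# feature 2 is uncorrelated with the label, feature 3 carries separate signal.
base = [1, 2, 3, 4, 5, 6, 7, 8]
columns = [
    base,
    [2 * v + 1 for v in base],
    [1, -1, -1, 1, 1, -1, -1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

clusters = mst_feature_clusters(columns, cut=0.5)
selected = select_representatives(columns, labels, clusters)
print("clusters:", clusters)
print("selected feature indices:", selected)
```

On this toy input the redundant pair ends up in one cluster and the label-uncorrelated feature is dropped. In a pipeline shaped like the one the abstract describes, the surviving columns would then be passed to the tree-based rankers.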

Keywords: credit risk assessment; feature selection; high-dimensional data; clustering; boosting tree