Credit scoring model based on a novel group feature selection method: The case of Chinese small-sized manufacturing enterprises
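This study proposes a credit scoring model based on group feature selection, which selects a feature subset by maximising the Gini coefficient and removes redundant features. Empirical results show that the method identifies default status more effectively than individual feature selection, and that too many features actually reduce discriminatory power.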
In building a predictive credit scoring model, feature selection is an essential pre-processing step that can improve the predictive accuracy and comprehensibility of models. In this study, we select the optimal feature subset using group feature selection rather than individual feature selection, in order to establish a credit scoring model for small manufacturing enterprises. In our methodology, we first select a group of features using 0-1 programming, with the objective of maximising the Gini coefficient (GINI) of the credit score so that it identifies the possibility of default. We then introduce constraints that remove redundant features, that is, features in the same subset that reflect the same information. Finally, we assign weights to the features according to the Gini coefficient, ensuring that each feature's weight reflects its discriminatory power. Our empirical results show that selecting a set of features identifies default status more effectively than the individual feature selection approach. Moreover, a rating system with more features does not necessarily have better discriminatory power: once the number of features exceeds the optimum number selected, the system's discriminatory ability begins to decrease.
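The three steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices in it are assumptions made for illustration only: the Gini coefficient is computed as 2·AUC − 1, the 0-1 programme is replaced by exhaustive enumeration of feature subsets (feasible only for a handful of features), redundancy is approximated by a pairwise Pearson correlation threshold (`max_corr`), weights are taken proportional to each feature's individual Gini, and the toy data and feature names are invented.

```python
from itertools import combinations


def gini(scores, defaults):
    """Gini = 2*AUC - 1; AUC is the share of (non-default, default) pairs
    in which the non-defaulter has the higher score (ties count one half)."""
    good = [s for s, d in zip(scores, defaults) if d == 0]
    bad = [s for s, d in zip(scores, defaults) if d == 1]
    wins = sum((g > b) + 0.5 * (g == b) for g in good for b in bad)
    return 2.0 * wins / (len(good) * len(bad)) - 1.0


def pearson(x, y):
    """Plain Pearson correlation (assumes neither series is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def select_group(features, defaults, max_corr=0.9):
    """Enumerate every 0-1 selection vector and keep the subset whose
    weighted credit score has the highest Gini coefficient.

    features: dict mapping a feature name to its values (higher = safer).
    max_corr: hypothetical redundancy threshold, an assumption of this sketch.
    """
    names = list(features)
    n_obs = len(defaults)
    # Individual discriminatory power; negative values are clipped to zero.
    single = {f: max(gini(features[f], defaults), 0.0) for f in names}
    best = (float("-inf"), (), {})
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            # Redundancy constraint: no highly correlated pair in one subset.
            if any(abs(pearson(features[a], features[b])) > max_corr
                   for a, b in combinations(subset, 2)):
                continue
            total = sum(single[f] for f in subset)
            if total == 0.0:
                continue
            # Assumption: weights proportional to each feature's own Gini.
            weights = {f: single[f] / total for f in subset}
            score = [sum(weights[f] * features[f][i] for f in subset)
                     for i in range(n_obs)]
            g = gini(score, defaults)
            if g > best[0]:
                best = (g, subset, weights)
    return best


# Invented toy data: 8 enterprises, the last 4 defaulted.
defaults = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "liquidity": [0.9, 0.8, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1],
    "liquidity_x2": [1.8, 1.6, 0.8, 1.4, 1.0, 0.6, 0.4, 0.2],  # redundant copy
    "profit": [0.6, 0.3, 0.9, 0.8, 0.7, 0.2, 0.4, 0.1],
    "noise": [0.5, 0.1, 0.6, 0.2, 0.9, 0.3, 0.8, 0.4],
}
best_gini, best_subset, weights = select_group(features, defaults)
print(best_subset, round(best_gini, 3), weights)
```

Because single-feature subsets are part of the search space, the selected group can never score below the best individual feature on the same data. Enumeration grows as 2^n in the number of candidate features, so a realistic feature pool would need a proper 0-1 programming solver or a heuristic search.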