Textual analysis and credit scoring: a new matrix factorization approach

Journal of the Operational Research Society · 2024
Cited by 2
ABS 3

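Summary

The study extracts textual variables from loan statements and combines them with numerical data to build a credit scoring model. It finds that the model based on optimal cut-off logistic regression achieves the highest accuracy and is more interpretable than deep learning methods.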

Abstract

Credit scoring models are important for financial institutions’ credit decisions. This study examined how variables can be extracted from loan statements and whether textual variables improve the accuracy of the default model. We used a combination of forward selection and non-negative matrix factorization to extract variables from loan statements, and built a credit scoring model using both loan statement and numerical data. In the comparative analysis, the credit scoring model built with optimal cut-off logistic regression on both types of data had the highest accuracy. Moreover, it was more interpretable than a credit scoring model constructed with a deep learning method based on word vectors. The regression analysis showed that the variables extracted from loan statements have a significant effect on default status.
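As a rough illustration of the factorization step named in the abstract, the sketch below factorizes a small document-term count matrix V into non-negative factors W (per-statement topic weights, which could serve as textual variables) and H (per-topic term weights). It is not the authors' implementation: the abstract does not give the algorithm's details, so the standard multiplicative update rules for squared error are assumed here, and the toy matrix, the function names `nmf` and `frobenius_error`, and all parameters are hypothetical. Forward selection and the logistic regression stage are not shown.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius distance between V and its reconstruction W @ H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Approximate V (n x m, non-negative) as W (n x rank) times H (rank x m).

    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero initially.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Hypothetical toy data: 4 loan statements (rows) x 5 vocabulary terms (columns).
V = [
    [2, 1, 0, 0, 0],
    [3, 2, 0, 0, 1],
    [0, 0, 2, 3, 0],
    [0, 1, 1, 2, 0],
]
W, H = nmf(V, rank=2)
for row in W:
    # Each row holds one statement's weights on the 2 latent topics.
    print([round(w, 2) for w in row])
```

Because every entry of W and H stays non-negative, each topic can be read as an additive group of terms, which is one plausible reason such variables are easier to interpret than word-vector features.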

Credit scoring · Text analysis · Matrix factorization · Machine learning