An interpretable labeling model for reject inference based on multi-level sub-model migration in the credit risk assessment scenario

Quantitative Finance · 2026

Cited by 0 · Top 7% of papers in the same journal and year

Renmin University BABS 3

Zhang Runchi · Nanjing University of Posts and Telecommunications (corresponding author)

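Summary (translated from Chinese)

To address the selection bias that arises because samples not approved by lending institutions lack labels, the paper proposes the LM-IMM model. It labels rejected samples by transferring multi-level interpretable sub-models, achieving high labeling accuracy while preserving interpretability and significantly improving the performance of downstream credit risk assessment models.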

Abstract

‘Rejected loan’ samples, that is, applications the lending institution did not approve, often contain a substantial amount of credit risk information. Because they lack labels, however, they cannot be fully used in credit risk assessment, which leads to sample selection bias. At the same time, most existing reject inference methods are machine learning models with pronounced ‘black box’ characteristics, which makes it difficult to satisfy regulators' interpretability requirements. In this paper, we develop the LM-IMM model, which labels ‘rejected loan’ samples using interpretable features and a multi-level sub-model transfer mechanism. The LM-IMM model first builds heterogeneous interpretable sub-models at multiple levels to identify the pivotal risk characteristics of the labeled samples. It then selects the most suitable sub-model for labeling each ‘rejected loan’ sample according to its resemblance to the clusters of labeled samples. We first show that the LM-IMM model has good interpretability; the empirical results then indicate that it achieves high labeling accuracy. Furthermore, 10 representative credit risk assessment models built on datasets that include the newly labeled samples show significant performance improvements. The LM-IMM model also performs robustly across different numbers of clusters and on datasets with different sample balance characteristics.
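The two-step idea in the abstract (fit interpretable sub-models on clusters of labeled samples, then label each rejected sample with the sub-model of the cluster it most resembles) can be illustrated with a minimal sketch. This is not the paper's LM-IMM implementation: the abstract does not specify the sub-model type, the similarity measure, or how levels are defined. The sketch assumes decision stumps as the interpretable sub-models, a tiny deterministic k-means for clustering, Euclidean distance to cluster centroids as the resemblance measure, and "levels" as different cluster counts. The toy features (income, debt ratio) and all function names are hypothetical.

```python
from math import dist  # Python 3.8+

def kmeans(points, k, iters=20):
    """Tiny deterministic k-means; initial centroids are k evenly spaced input points."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    dims = len(points[0])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(idx)
        centroids = [
            tuple(sum(points[i][d] for i in g) / len(g) for d in range(dims))
            if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids, groups

def fit_stump(X, y):
    """Fit a one-feature threshold rule (decision stump) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for above in (0, 1):  # label predicted when x[f] > t
                errors = sum(
                    (above if x[f] > t else 1 - above) != yi for x, yi in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return {"feature": f, "threshold": t, "label_above": above}

def predict_stump(stump, x):
    above = stump["label_above"]
    return above if x[stump["feature"]] > stump["threshold"] else 1 - above

def build_submodels(X, y, levels=(1, 2)):
    """Assumption: each level is a cluster count; one stump is fitted per cluster."""
    submodels = []
    for k in levels:
        centroids, groups = kmeans(X, k)
        for centroid, g in zip(centroids, groups):
            if g:
                stump = fit_stump([X[i] for i in g], [y[i] for i in g])
                submodels.append({"level": k, "centroid": centroid, "stump": stump})
    return submodels

def label_rejected(submodels, x):
    """Label x with the sub-model whose cluster centroid is closest to x."""
    chosen = min(submodels, key=lambda m: dist(x, m["centroid"]))
    return predict_stump(chosen["stump"], x), chosen

# Toy labeled (approved) samples: (income, debt_ratio); 1 = default, 0 = repaid.
X = [(1.0, 0.2), (0.9, 0.5), (1.2, 0.3), (1.1, 0.9),
     (5.0, 0.1), (5.2, 0.7), (4.9, 0.2), (5.1, 0.9)]
y = [0, 1, 0, 1, 0, 0, 0, 1]

submodels = build_submodels(X, y)
label_low, used_low = label_rejected(submodels, (1.0, 0.6))
label_high, used_high = label_rejected(submodels, (5.0, 0.6))
print(label_low, used_low["stump"])    # low-income cluster rule
print(label_high, used_high["stump"])  # high-income cluster rule
```

In this toy data the two rejected applicants share a debt ratio of 0.6 but receive different labels, because each is scored by the threshold rule of the cluster it resembles. Each assigned label can be traced back to a single readable rule, which is the kind of interpretability the abstract emphasizes.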

Keywords: credit risk assessment · reject inference · interpretable machine learning · sample selection bias
