An interpretable machine learning methodology to generate interaction effect hypotheses from complex datasets

DECISION SCIENCES · 2024
Cited by 7
Renmin University of China (RUC) AABS 3

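Summary

Proposes a new method called SIFT that helps interpret machine learning models by identifying interaction effects between variables, improving model transparency and understanding of the data-generation process.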

Abstract

Machine learning (ML) models are increasingly being used in decision‐making, but they can be difficult to understand because most ML models are black boxes, meaning that their inner workings are not transparent. This can make interpreting the results of ML models and understanding the underlying data‐generation process (DGP) challenging. In this article, we propose a novel methodology called Simple Interaction Finding Technique (SIFT) that can help make ML models more interpretable. SIFT is a data‐ and model‐agnostic approach that can be used to identify interaction effects between variables in a dataset. This can help improve our understanding of the DGP and make ML models more transparent and explainable to a wider audience. We test the proposed methodology against various factors (such as ML model complexity, dataset noise, spurious variables, and variable distributions) to assess its effectiveness and weaknesses. We show that the methodology is robust against many potential problems in the underlying dataset as well as ML algorithms.
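The abstract does not spell out SIFT's procedure, so the following is not SIFT. It is only a minimal sketch of what a generic model-agnostic interaction check can look like, assuming nothing about the model beyond a callable `predict` function. The helper name `interaction_strength`, the quantile probe points, and the toy `black_box` function are all hypothetical and chosen for illustration. The idea: if two features contribute additively, the mixed finite difference f(hi, hi) - f(hi, lo) - f(lo, hi) + f(lo, lo) is zero, so a clearly nonzero value hints at an interaction worth examining as a hypothesis.

```python
import numpy as np


def interaction_strength(predict, X, i, j, q=(0.25, 0.75)):
    """Mean absolute mixed difference of `predict` in features i and j.

    Hypothetical helper, not part of SIFT or any library. It only calls
    `predict`, so it works with any fitted model or black-box function.
    """
    # Probe each feature at a low and a high quantile of its observed values.
    lo_i, hi_i = np.quantile(X[:, i], q)
    lo_j, hi_j = np.quantile(X[:, j], q)

    def at(vi, vj):
        # Overwrite features i and j for every row; keep the others as observed.
        Z = X.copy()
        Z[:, i] = vi
        Z[:, j] = vj
        return predict(Z)

    # For a function additive in features i and j, these four terms cancel.
    mixed = at(hi_i, hi_j) - at(hi_i, lo_j) - at(lo_i, hi_j) + at(lo_i, lo_j)
    return float(np.mean(np.abs(mixed)))


# Toy data and a stand-in "black box" (assumed for illustration only):
# features 0 and 1 interact multiplicatively, feature 2 is purely additive.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))


def black_box(Z):
    return Z[:, 0] * Z[:, 1] + 2.0 * Z[:, 2]


s01 = interaction_strength(black_box, X, 0, 1)  # interacting pair
s02 = interaction_strength(black_box, X, 0, 2)  # additive pair
print(f"pair (0, 1): {s01:.3f}")
print(f"pair (0, 2): {s02:.3f}")
```

In this toy setup the interacting pair yields a clearly positive score while the additive pair cancels to zero up to floating-point error. With a real fitted model, `predict` would be the model's prediction method, and any flagged pair would remain a hypothesis to test rather than a confirmed effect.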

Computer Science · Artificial Intelligence · Machine Learning · Interpretability