Beyond Early Warning Indicators: High School Dropout and Machine Learning

Oxford Bulletin of Economics and Statistics · 2018
Cited by 69 · Top 7% among articles in the same journal and year
Renmin University of China (RUC) · AABS 3

Summary

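Combining machine learning with economic theory, the paper predicts high school dropout using only 9th-grade information. It finds that traditional early warning systems perform poorly, whereas tools such as Support Vector Machine improve predictive accuracy, and it calibrates the models with the policy goal and the budget constraint in mind.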

Abstract

This paper combines machine learning with economic theory in order to analyse high school dropout. It provides an algorithm to predict which students are going to drop out of high school by relying only on information from 9th grade. This analysis emphasizes that using a parsimonious early warning system – as implemented in many schools – leads to poor results. It shows that schools can obtain more precise predictions by exploiting the available high-dimensional data jointly with machine learning tools such as Support Vector Machine, Boosted Regression and Post-LASSO. Goodness-of-fit criteria are selected based on the context and the underlying theoretical framework: model parameters are calibrated by taking into account the policy goal – minimizing the expected dropout rate – and the school budget constraint. Finally, this study verifies the existence of heterogeneity through unsupervised machine learning by dividing students at risk of dropping out into different clusters.
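
The abstract does not spell out how the budget constraint enters the calibration. As a rough illustration only, the sketch below assumes the budget can be read as a cap on the number of students a school can enrol in an intervention, and that some upstream model has already produced a dropout risk score for each student. The function names, the toy scores and the toy outcomes are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def flag_within_budget(risk_scores: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` students with the highest risk scores.

    Assumption (not from the paper): the budget is a simple head count.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Rank student indices from highest to lowest predicted risk.
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i],
                    reverse=True)
    # Keep only as many students as the budget allows, in index order.
    return sorted(ranked[:budget])


def share_of_dropouts_flagged(flagged: Sequence[int],
                              dropped_out: Sequence[bool]) -> float:
    """Fraction of actual dropouts that were flagged (recall)."""
    total = sum(dropped_out)
    if total == 0:
        return 0.0
    hits = sum(1 for i in flagged if dropped_out[i])
    return hits / total


# Hypothetical toy data: six students, two of whom eventually drop out.
scores = [0.10, 0.80, 0.35, 0.60, 0.05, 0.90]
outcomes = [False, True, False, True, False, False]

flagged = flag_within_budget(scores, budget=2)
recall = share_of_dropouts_flagged(flagged, outcomes)
print(flagged, recall)
```

Raising `budget` in this toy setting flags more students and can only raise the share of dropouts reached, which is one simple way to see why a goodness-of-fit criterion might depend on the resources available.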

high school dropout · prediction · machine learning · early warning indicators · heterogeneity · clustering