Rethinking Variable Importance in Machine Learning: An Economic Perspective on Empirical Asset Pricing
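Examines which firm characteristics drive the economic value of machine learning portfolios. It finds that in-sample variable importance overfits, that microcaps distort the results, and that some predictors carry negative importance, so removing them improves risk-adjusted returns.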
We study which firm characteristics drive the economic value of machine learning portfolios. Three results stand out. First, in-sample variable importance overfits and provides little reliable guidance, highlighting the need for out-of-sample evaluation using economic criteria. Second, conventional models are dominated by microcaps, which inflate returns and concentrate gains in costly-to-trade stocks; excluding microcaps is essential for meaningful inference. Third, some predictors carry negative importance and consistently degrade performance; removing them improves risk-adjusted returns and clarifies which characteristics matter. These findings show that only with economic restrictions can machine learning deliver robust asset pricing insights.
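The idea of judging a predictor by an out-of-sample economic criterion can be illustrated with a small sketch. It is not the authors' procedure. It assumes a pooled OLS forecast as a stand-in for a machine learning model, an equal-weighted decile long-short portfolio, and the annualized Sharpe ratio as the economic criterion. All function names, the synthetic data, and the parameters `t_split` and `frac` are hypothetical. A predictor's importance is defined here as the drop in out-of-sample Sharpe ratio when that predictor is left out, so a negative value flags a characteristic whose removal would have improved performance.

```python
import numpy as np


def annualized_sharpe(monthly_returns):
    """Annualized Sharpe ratio of a monthly excess-return series."""
    r = np.asarray(monthly_returns, dtype=float)
    return np.sqrt(12.0) * r.mean() / r.std(ddof=1)


def long_short_returns(pred, realized, frac=0.1):
    """Monthly return of an equal-weighted long-short portfolio.

    pred, realized: arrays of shape (T, N). Each month, go long the top
    `frac` of stocks by predicted return and short the bottom `frac`.
    """
    T, N = pred.shape
    k = max(1, int(N * frac))
    order = np.argsort(pred, axis=1)      # ascending by prediction
    rows = np.arange(T)[:, None]
    short_leg = realized[rows, order[:, :k]].mean(axis=1)
    long_leg = realized[rows, order[:, -k:]].mean(axis=1)
    return long_leg - short_leg


def fit_predict(X_train, y_train, X_test):
    """Pooled OLS with intercept (assumed stand-in for an ML model)."""
    K = X_train.shape[-1]
    flat = X_train.reshape(-1, K)
    A = np.column_stack([np.ones(flat.shape[0]), flat])
    beta, *_ = np.linalg.lstsq(A, y_train.ravel(), rcond=None)
    pred = X_test.reshape(-1, K) @ beta[1:] + beta[0]
    return pred.reshape(X_test.shape[:2])


def oos_sharpe(X, y, t_split, frac=0.1):
    """Fit on months before t_split, evaluate the portfolio afterwards."""
    pred = fit_predict(X[:t_split], y[:t_split], X[t_split:])
    return annualized_sharpe(long_short_returns(pred, y[t_split:], frac))


def oos_economic_importance(X, y, t_split, frac=0.1):
    """Leave-one-out importance: full-model Sharpe minus Sharpe without j."""
    K = X.shape[-1]
    full = oos_sharpe(X, y, t_split, frac)
    importance = {}
    for j in range(K):
        keep = [c for c in range(K) if c != j]
        importance[j] = full - oos_sharpe(X[..., keep], y, t_split, frac)
    return full, importance


# Synthetic panel: 120 months, 200 stocks, 3 characteristics.
# By construction only characteristic 0 carries return information.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 200, 3))
y = 0.05 * X[..., 0] + 0.1 * rng.standard_normal((120, 200))

full_sr, importance = oos_economic_importance(X, y, t_split=60)
print(f"full-model out-of-sample Sharpe: {full_sr:.2f}")
for j, value in importance.items():
    print(f"characteristic {j}: Sharpe importance {value:+.2f}")
```

To mirror the abstract's second point, the same routine could be run on a panel from which microcaps have been filtered out beforehand. That filter is left out here because it depends on market capitalization data and breakpoints that this sketch does not model.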