Choices Matter When Training Machine Learning Models for Return Prediction
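Summary: When training machine learning models to predict stock returns, ignoring heterogeneity in the data or imbalanced sample representation leads to poor performance. Training group-specific models and predicting relative returns significantly improves the economic performance of value-weighted trading strategies, offering modeling guidance for researchers.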
Applying machine learning to cross-sectional stock return prediction requires careful consideration of modeling choices. Common approaches that fail to account for heterogeneity or imbalanced stock representation in training data can lead to suboptimal performance. I study two strategies to address these issues: training group-specific models and predicting relative returns. Both approaches yield similar economic improvements over models trained on the full cross-section of US stock returns, with value-weighted trading strategies benefiting significantly. The findings underscore the importance of aligning machine learning modeling decisions with desired economic outcomes and provide guidance for researchers and practitioners seeking robust machine learning models.
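The two strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not define either one precisely, so everything here is an assumption: "relative return" is taken to mean a stock's return minus the average return of its group in the same month, the grouping is a hypothetical large/small split, the model is a one-feature least-squares fit standing in for whatever learner is actually used, and the field names (`month`, `group`, `signal`, `ret`) and numbers are made-up toy values, not results from the paper.

```python
from collections import defaultdict
from statistics import mean


def relative_returns(rows):
    """Add 'rel_ret': return minus the same-month, same-group average return.

    Assumed definition of "relative return"; the abstract does not give one.
    """
    buckets = defaultdict(list)
    for r in rows:
        buckets[(r["month"], r["group"])].append(r["ret"])
    avg = {key: mean(vals) for key, vals in buckets.items()}
    return [{**r, "rel_ret": r["ret"] - avg[(r["month"], r["group"])]} for r in rows]


def fit_ols(xs, ys):
    """One-feature least squares; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def fit_group_models(rows, target="rel_ret"):
    """Fit one model per group instead of one model on the pooled cross-section."""
    by_group = defaultdict(list)
    for r in rows:
        by_group[r["group"]].append(r)
    return {
        g: fit_ols([r["signal"] for r in rs], [r[target] for r in rs])
        for g, rs in by_group.items()
    }


# Toy data (illustrative values only): one month, two hypothetical size groups.
rows = [
    {"month": "2020-01", "group": "large", "signal": 0.0, "ret": 0.01},
    {"month": "2020-01", "group": "large", "signal": 1.0, "ret": 0.03},
    {"month": "2020-01", "group": "small", "signal": 0.0, "ret": -0.05},
    {"month": "2020-01", "group": "small", "signal": 1.0, "ret": 0.15},
]

rel = relative_returns(rows)
models = fit_group_models(rel)
for group, (intercept, slope) in sorted(models.items()):
    print(group, round(intercept, 4), round(slope, 4))
```

In this toy setup the small-stock group has a much steeper signal-to-return slope than the large-stock group, which is the kind of heterogeneity a single pooled model would average away. Demeaning within group also keeps the wider return dispersion of small stocks from dominating the fit for large stocks.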