Pooling and winsorizing machine learning forecasts to predict stock returns with high-dimensional data

Journal of Empirical Finance · 2024

Cited by 5


Erik Mekelburg · University of Denver
Jack Strauss · University of Denver

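Summary

The paper uses several machine learning models to forecast US market stock returns. It finds that individual models produce large prediction errors, but that winsorizing and pooling the model forecasts delivers stable out-of-sample predictive power. Data from several other countries confirm the importance of pooling.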

Abstract

We evaluate US market return predictability using a novel data set of several hundred aggregated firm-level characteristics. We apply LASSO, Elastic Net, Random Forest, Neural Net, Extreme Gradient Boosting, and Light Gradient Boosting Machine methods and find these models experience large prediction errors that lead to forecast failures. However, winsorizing and pooling machine learning model forecasts provide consistent out-of-sample predictability. To assess robustness, we apply machine learning methods to high-dimensional data for Canada, China, Germany and the UK as well as the Goyal-Welch data. All machine learning models we consider, except for the ensemble pooled methods, fail to significantly predict returns across our samples, highlighting the importance of pooling, evaluating additional economies, and the fragility of individual machine learning methods. Our results shed light on the sparsity versus density debate as the degree of sparsity and variable importance evolve over time.
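
To make the winsorize-then-pool idea concrete, here is a minimal Python sketch. It is not the authors' procedure: the 5th/95th percentile cutoffs, the equal-weighted average, and the function names `winsorize` and `pool_forecasts` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def winsorize(forecasts, lower_pct=5.0, upper_pct=95.0):
    """Clip a forecast series to its own lower/upper percentiles.

    Assumption: cutoffs are taken from the full series for simplicity.
    A real out-of-sample exercise would compute them from past data only
    to avoid look-ahead bias.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    lo, hi = np.percentile(forecasts, [lower_pct, upper_pct])
    return np.clip(forecasts, lo, hi)


def pool_forecasts(model_forecasts, lower_pct=5.0, upper_pct=95.0):
    """Winsorize each model's forecasts, then average across models.

    model_forecasts: dict mapping a model name to a 1-D array of
    forecasts, all of the same length (one entry per period).
    Equal weighting is an illustrative choice, not taken from the paper.
    """
    clipped = [
        winsorize(series, lower_pct, upper_pct)
        for series in model_forecasts.values()
    ]
    return np.mean(clipped, axis=0)


if __name__ == "__main__":
    # Made-up numbers: the first model has one extreme forecast (0.50).
    example = {
        "lasso": [0.01, 0.02, -0.01, 0.50],
        "random_forest": [0.00, 0.01, 0.02, 0.01],
    }
    print(pool_forecasts(example))
```

Clipping before averaging limits how much a single extreme forecast from one model can move the pooled forecast, which is the intuition behind the abstract's contrast between fragile individual models and the more stable pooled forecasts.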

Financial economics · Machine learning · Asset pricing · Forecasting methods · Empirical finance
