Forecasting the equity premium: can machine learning beat the historical average?
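Predicts the equity premium with a range of machine learning methods and finds that out-of-sample forecasts cannot beat the historical average, which is attributed to the small dataset size and low signal-to-noise ratio; bond interest rate-related variables are identified as the most important predictors.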
We empirically predict the equity premium using the machine learning methods selected in Gu et al. (Empirical asset pricing via machine learning. Rev. Financ. Stud., 2020, 33(5), 2223–2273). We also consider four additional popular machine learning methods (ridge regression, support vector regression, k-nearest neighbors, and extreme gradient boosted trees), as well as a method that combines them. Using a dataset of both macroeconomic and technical predictors, we find that the models show strong in-sample forecasting ability, particularly the tree-based models. Out of sample, however, the results support the finding of Welch and Goyal (A comprehensive look at the empirical performance of equity premium prediction. Rev. Financ. Stud., 2008, 21(4), 1455–1508) that competing forecasting models generally fail to outperform the historical average benchmark. We attribute this failure to the small dataset size and the low signal-to-noise ratio inherent in equity premium prediction. Our variable importance analysis further identifies three bond interest rate-related variables as the dominant predictors of the equity premium. An economic value analysis from a market timing perspective shows that the historical average benchmark strategy generates the highest average return (25.63%) and the best Sharpe ratio (0.8). Finally, our findings are robust across a variety of settings.
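To make the comparison against the historical average benchmark concrete, the following is a minimal sketch of one common way to score such forecasts: an out-of-sample R² measured relative to an expanding-window mean of past returns. The abstract does not state which evaluation metric is used, so this metric, the function names `historical_average_forecast` and `oos_r2`, and the toy numbers are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def historical_average_forecast(returns, start):
    """Expanding-window mean benchmark (hypothetical helper).

    The forecast for period t is the mean of returns[0], ..., returns[t-1],
    produced for t = start, ..., len(returns) - 1. Requires start >= 1.
    """
    returns = np.asarray(returns, dtype=float)
    if start < 1 or start >= len(returns):
        raise ValueError("start must satisfy 1 <= start < len(returns)")
    cumulative = np.cumsum(returns)
    t = np.arange(start, len(returns))
    return cumulative[t - 1] / t


def oos_r2(actual, model_forecast, benchmark_forecast):
    """Out-of-sample R^2 of a model relative to a benchmark forecast.

    Positive values mean the model has a lower squared forecast error than
    the benchmark; negative values mean the benchmark does better.
    """
    actual = np.asarray(actual, dtype=float)
    model_forecast = np.asarray(model_forecast, dtype=float)
    benchmark_forecast = np.asarray(benchmark_forecast, dtype=float)
    sse_model = np.sum((actual - model_forecast) ** 2)
    sse_benchmark = np.sum((actual - benchmark_forecast) ** 2)
    return 1.0 - sse_model / sse_benchmark


# Toy, made-up excess returns (not data from the paper).
toy_returns = np.array([0.01, 0.03, -0.02, 0.04])
toy_start = 2  # first period that is forecast out of sample
toy_benchmark = historical_average_forecast(toy_returns, toy_start)
toy_actual = toy_returns[toy_start:]

# A model that simply reproduces the benchmark scores exactly zero.
toy_r2_same = oos_r2(toy_actual, toy_benchmark, toy_benchmark)
```

Under this assumed metric, the abstract's conclusion corresponds to competing models generally not achieving a positive value: their squared forecast errors are no smaller than those of the expanding mean.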