Choosing among Regularized Estimators in Empirical Economics: The Risk of Machine Learning
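Summary: Offers empirical economics researchers guidance on choosing among regularized estimators and on data-driven selection of regularization parameters. It analyzes how risk depends on the data-generating process, shows that data-driven choices attain risk close to that of the optimal choice, and illustrates the results with examples.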
Abstract: Many settings in empirical economics involve the estimation of a large number of parameters. In such settings, methods that combine regularized estimation with data-driven choices of regularization parameters are useful. We provide guidance to applied researchers on the choice among regularized estimators and on the data-driven selection of regularization parameters. We characterize the risk and relative performance of regularized estimators as a function of the data-generating process, and we show that data-driven choices of regularization parameters yield estimators with risk uniformly close to the risk attained under the optimal (infeasible) choice of regularization parameters. We illustrate using examples from empirical economics.
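To make the central claim concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple normal-means setup (each observation x_i equals an unknown mean mu_i plus standard normal noise), a ridge-type estimator x_i / (1 + lambda), and Stein's unbiased risk estimate (SURE) as the data-driven criterion. The abstract specifies none of these choices; the function names, the grid, and the simulated data-generating process are all hypothetical. The sketch compares the loss at the SURE-selected lambda with the loss at the infeasible oracle lambda, which requires knowledge of the true means.

```python
import random


def ridge(x, lam):
    # Ridge-type shrinkage toward zero: each observation is scaled by 1 / (1 + lam).
    return [xi / (1.0 + lam) for xi in x]


def sure_ridge(x, lam):
    # Stein's unbiased risk estimate for the estimator c * x with c = 1 / (1 + lam),
    # assuming unit noise variance (an assumption of this sketch).
    c = 1.0 / (1.0 + lam)
    mean_sq = sum(xi * xi for xi in x) / len(x)
    return (c - 1.0) ** 2 * mean_sq + 2.0 * c - 1.0


def loss(est, mu):
    # Average squared error against the true means (infeasible in practice).
    return sum((e - m) ** 2 for e, m in zip(est, mu)) / len(mu)


def simulate(n=5000, seed=0):
    # Hypothetical data-generating process: mu_i ~ N(0, 2^2), x_i = mu_i + N(0, 1).
    rng = random.Random(seed)
    mu = [rng.gauss(0.0, 2.0) for _ in range(n)]
    x = [m + rng.gauss(0.0, 1.0) for m in mu]
    return mu, x


# Candidate regularization parameters: 0.00, 0.05, ..., 3.00.
grid = [0.05 * k for k in range(61)]

mu, x = simulate()

# Data-driven choice: minimize SURE, which uses only the observed x.
lam_sure = min(grid, key=lambda lam: sure_ridge(x, lam))

# Oracle choice: minimize the true loss, which uses the unknown mu.
lam_oracle = min(grid, key=lambda lam: loss(ridge(x, lam), mu))

loss_sure = loss(ridge(x, lam_sure), mu)
loss_oracle = loss(ridge(x, lam_oracle), mu)
loss_unregularized = loss(x, mu)

print(f"SURE lambda   = {lam_sure:.2f}, loss = {loss_sure:.4f}")
print(f"oracle lambda = {lam_oracle:.2f}, loss = {loss_oracle:.4f}")
print(f"unregularized loss = {loss_unregularized:.4f}")
```

In this simulated design the two losses should be close when the number of parameters is large, which is the flavor of the "uniformly close to the optimal risk" result. A single simulation under one data-generating process illustrates the idea but does not establish uniformity.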