Inference on Treatment Effects after Selection among High-Dimensional Controls

Review of Economic Studies · 2013
Cited by 1420 · Top 2% among papers in the same journal and year
Renmin University of China (RUC) A+ · FT50 · ABS 4*

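Summary

Proposes a “post-double-selection” method that delivers robust inference on treatment effects when there are many control variables and the model is approximately sparse, resolving the problem of uniform inference after model selection.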

Abstract

We propose robust methods for inference about the effect of a treatment variable on a scalar outcome in the presence of very many regressors in a model with possibly non-Gaussian and heteroscedastic disturbances. We allow for the number of regressors to be larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by including a relatively small number of variables whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of regressors. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the “post-double-selection” method. The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus, our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We also present a generalization of our method to a fully heterogeneous model with a binary treatment variable. We illustrate the use of the developed methods with numerical simulations and an application that considers the effect of abortion on crime rates.
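
The abstract names the post-double-selection method without listing its steps. As the procedure is commonly described, it (1) runs a Lasso of the outcome on the controls, (2) runs a Lasso of the treatment on the controls, and (3) regresses the outcome on the treatment plus the union of the controls selected in either step. The following is a minimal sketch of that idea on simulated data, not the authors' implementation: the fixed penalties `lam_y` and `lam_d`, the plain coordinate-descent Lasso, the helper names, and the simulated design are all assumptions made for illustration, and the standard errors needed for confidence intervals are omitted.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def lasso_support(cols, y, lam, n_iter=100):
    """Indices with nonzero Lasso coefficients, found by coordinate descent on
    (1/2n)*||y - Xb||^2 + lam*||b||_1, where cols[j] is the j-th centered column."""
    n, p = len(y), len(cols)
    b = [0.0] * p
    r = list(y)                                   # current residual y - Xb
    sq = [sum(a * a for a in c) / n for c in cols]
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            if sq[j] == 0.0:
                continue
            rho = sum(a * e for a, e in zip(c, r)) / n + sq[j] * b[j]
            # soft-thresholding update for coordinate j
            new = max(abs(rho) - lam, 0.0) * (1 if rho > 0 else -1) / sq[j]
            if new != b[j]:
                step = new - b[j]
                r = [e - step * a for a, e in zip(c, r)]
                b[j] = new
    return {j for j in range(p) if b[j] != 0.0}

def ols(cols, y):
    """Least-squares coefficients of y on the given columns (normal equations)."""
    k = len(cols)
    A = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    for i in range(k):                            # Gauss-Jordan, partial pivoting
        piv = max(range(i, k), key=lambda row: abs(A[row][i]))
        A[i], A[piv] = A[piv], A[i]
        for row in range(k):
            if row != i:
                f = A[row][i] / A[i][i]
                A[row] = [x - f * z for x, z in zip(A[row], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]

def post_double_selection(y, d, X, lam_y, lam_d):
    """Return (treatment-effect estimate, sorted indices of selected controls)."""
    p = len(X[0])
    cols = [center([row[j] for row in X]) for j in range(p)]
    yc, dc = center(y), center(d)
    # Union of controls that predict the outcome or the treatment.
    sel = sorted(lasso_support(cols, yc, lam_y) | lasso_support(cols, dc, lam_d))
    coef = ols([dc] + [cols[j] for j in sel], yc)
    return coef[0], sel

# Hypothetical simulated design: control 0 confounds treatment and outcome,
# and the true treatment effect is 1.0.
random.seed(0)
n, p = 200, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
d = [row[0] + 0.5 * row[1] + random.gauss(0, 1) for row in X]
y = [1.0 * di + row[0] + 0.5 * row[2] + random.gauss(0, 1) for di, row in zip(d, X)]
alpha_hat, selected = post_double_selection(y, d, X, lam_y=0.2, lam_d=0.2)
print(alpha_hat, selected)
```

Selecting on both equations is what guards against dropping a control that matters for the treatment but is only weakly related to the outcome. In applied work the penalty level would be chosen by a data-driven rule rather than fixed by hand as it is here.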

High-dimensional controls · Treatment effects · Post-double-selection method · Robust inference