Many covariate and cluster robust estimation and inference
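For linear regression models with many covariates and cluster-dependent data, the paper proposes a leave-cluster-out-crossfit variance estimation method that is suited to settings with many covariates and heteroskedasticity, and validates it through simulations and empirical examples.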
Empirical economists often use regression models employing large sets of covariates and presuming clustered data dependence. We provide inference methods for linear regressions with covariates whose number may be comparable to sample size and observations that are clustered into possibly heterogeneous clusters. We present a leave-cluster-out-crossfit (LCOC) method of constructing an OLS asymptotic variance estimator, which extends leave-one-out variance estimation for independent data to clustered data and which is robust to many covariates and heteroskedasticity. We show consistency of the LCOC estimator and asymptotic normality of the standardized OLS estimator. We demonstrate finite-sample properties of LCOC in simulations in comparison with available alternatives. Finally, we provide two empirical illustrations, where LCOC is applied to existing studies of the effects of high school achievement awards and the impact of legalized abortion on crime reduction.
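To make the leave-cluster-out idea concrete, the following is a minimal sketch and not the paper's exact LCOC estimator, whose precise form the abstract does not give. It assumes a linear model with errors that are mean zero given the covariates and independent across clusters. For each cluster g, OLS is refit without that cluster, and the within-cluster error covariance is estimated by the cross product of the outcomes y_g with the leave-cluster-out residuals y_g - X_g b_{-g}, which is unbiased under those assumptions. The function name `leave_cluster_out_variance`, the symmetrization step, and the sandwich form are illustrative choices.

```python
import numpy as np

def leave_cluster_out_variance(X, y, clusters):
    """Illustrative leave-cluster-out sandwich variance for OLS.

    Hypothetical helper for exposition only; not the paper's LCOC formula.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)

    XtX = X.T @ X
    Xty = X.T @ y
    bread = np.linalg.inv(XtX)
    beta = bread @ Xty  # full-sample OLS coefficients

    meat = np.zeros_like(XtX)
    for g in np.unique(clusters):
        idx = clusters == g
        Xg, yg = X[idx], y[idx]
        # OLS refit without cluster g; needs X'X - Xg'Xg to be invertible.
        beta_minus_g = np.linalg.solve(XtX - Xg.T @ Xg, Xty - Xg.T @ yg)
        resid_minus_g = yg - Xg @ beta_minus_g
        # Cross-fit estimate of the error covariance within cluster g
        # (assumed form), symmetrized so the final matrix is symmetric.
        omega_g = np.outer(yg, resid_minus_g)
        omega_g = 0.5 * (omega_g + omega_g.T)
        meat += Xg.T @ omega_g @ Xg

    return beta, bread @ meat @ bread

# Usage on simulated data: 12 clusters of 5 observations, 3 covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
clusters = np.repeat(np.arange(12), 5)
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=60)
beta_hat, V_hat = leave_cluster_out_variance(X, y, clusters)
```

An estimate built this way is not guaranteed to be positive semidefinite in finite samples, so a practical implementation would need to handle that case, along with clusters whose removal makes the remaining design matrix singular.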