Variable selection in linear regressions with possibly all strongly correlated covariates

Econometric Reviews · 2025
Cited by: 0
Journal rankings: Renmin University of China (RUC) list A-; ABS 3

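Summary

LASSO and OCMT can both fail when possibly all covariates are correlated with the signals. This paper proposes a generalized OCMT method (GOCMT) that consistently selects all signal variables while excluding the noise variables, and that performs well in risk factor selection for asset pricing.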

Abstract

Penalized regression methods, particularly LASSO, require the irrepresentable condition for variable selection consistency. This condition places an upper bound on the magnitudes of correlation between the signals and the rest of the covariates under consideration. Chudik, Kapetanios, and Pesaran (2018) proposed an alternative procedure, one covariate at a time multiple testing (OCMT), that does not impose restrictions on the magnitude of these correlations. However, it requires that the number of covariates correlated with the signals grows at a rate less than the square root of the number of observations, denoted by T. Notably, the conditions required by both LASSO and OCMT can fail when possibly all the covariates under consideration are correlated with the signals. In this article, we draw on ideas from the latent factor literature to adapt OCMT to such scenarios. We refer to our proposed method as generalized one covariate at a time multiple testing (GOCMT). We establish that GOCMT asymptotically selects a model that contains all the signals and none of the noise variables. We also show that the least squares estimator of the post-GOCMT selected model is √T-consistent. The proposed method demonstrates promising finite-sample performance in our Monte Carlo experiments. An empirical application to risk factor selection in asset pricing further underscores the utility of our method.

Keywords: variable selection; strongly correlated covariates; OCMT; GOCMT
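To make the idea in the abstract concrete, the following is a minimal Python sketch of a single one-covariate-at-a-time testing pass, with an optional step that conditions on principal-component estimates of latent factors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the critical value formula Φ⁻¹(1 − p / (2nᵟ)), the defaults p = 0.05 and δ = 1, the number of factors, and the simulated data are all choices made here for illustration. The published procedures involve further stages and conditions that this sketch omits.

```python
import numpy as np
from statistics import NormalDist


def critical_value(n, p=0.05, delta=1.0):
    """Assumed multiple-testing threshold Phi^{-1}(1 - p / (2 * n**delta)) for n tests."""
    return NormalDist().inv_cdf(1.0 - p / (2.0 * n ** delta))


def t_stat_last(y, Z):
    """OLS t-statistic on the last column of Z in a regression of y on Z."""
    T, k = Z.shape
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta = ZtZ_inv @ Z.T @ y
    resid = y - Z @ beta
    sigma2 = resid @ resid / (T - k)
    return beta[-1] / np.sqrt(sigma2 * ZtZ_inv[-1, -1])


def one_at_a_time_select(y, X, n_factors=0, p=0.05, delta=1.0):
    """Test each covariate separately and keep those with |t| above the threshold.

    With n_factors > 0, every regression also includes principal-component
    estimates of latent factors extracted from X (a simplified stand-in for
    the factor-adjustment idea described in the abstract).
    """
    T, n = X.shape
    controls = [np.ones((T, 1))]  # intercept
    if n_factors > 0:
        Xc = X - X.mean(axis=0)
        U, _, _ = np.linalg.svd(Xc, full_matrices=False)
        controls.append(U[:, :n_factors] * np.sqrt(T))  # estimated factors
    W = np.hstack(controls)
    c = critical_value(n, p, delta)
    t = np.array([t_stat_last(y, np.column_stack([W, X[:, i]])) for i in range(n)])
    return np.flatnonzero(np.abs(t) > c), t


# Hypothetical simulated data: one common factor drives every covariate,
# so all covariates are correlated with the two signals (columns 0 and 1).
rng = np.random.default_rng(0)
T, n = 500, 50
f = rng.standard_normal(T)
X = f[:, None] + rng.standard_normal((T, n))
y = X[:, 0] + X[:, 1] + rng.standard_normal(T)

sel_plain, _ = one_at_a_time_select(y, X, n_factors=0)   # no factor adjustment
sel_factor, _ = one_at_a_time_select(y, X, n_factors=1)  # conditions on one factor
print("selected without factor adjustment:", len(sel_plain))
print("selected with factor adjustment:", sel_factor.tolist())
```

In this simulated design every covariate is marginally correlated with the target through the common factor, so the unadjusted pass tends to flag most of them, while conditioning on the estimated factor isolates the idiosyncratic part of each covariate before testing.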