Targeted principal components regression
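Proposes a principal components regression method that uses the responses and predictors jointly to select linear combinations of the predictors, addressing the shortcoming that conventional principal components regression ignores the response. The estimator is consistent and asymptotically more efficient in a variety of settings, and it outperforms conventional methods in simulations and in stock return prediction.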
We propose a principal components regression method based on maximizing a joint pseudo-likelihood for responses and predictors. Our method uses both responses and predictors to select linear combinations of the predictors relevant for the regression, thereby addressing an oft-cited deficiency of conventional principal components regression. The proposed estimator is shown to be consistent in a wide range of settings, including ones with non-normal and dependent observations; conditions on the first and second moments suffice if the number of predictors (p) is fixed, the number of observations (n) tends to infinity, and dependence is weak, while stronger distributional assumptions are needed when p→∞ with n. We obtain the estimator’s asymptotic distribution as the projection of a multivariate normal random vector onto a tangent cone of the parameter set at the true parameter, and find the estimator is asymptotically more efficient than competing ones. In simulations our method is substantially more accurate than conventional principal components regression and compares favorably to partial least squares and predictor envelopes. The method’s practical usefulness is illustrated in a data example with cross-sectional prediction of stock returns.
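The deficiency of conventional principal components regression mentioned above can be illustrated with a minimal sketch. The code below implements only the conventional method, not the proposed estimator; the simulated design, the function name `pcr_fit`, and all parameter values are assumptions chosen for illustration. The signal is placed in the lowest-variance predictor direction, which conventional principal components regression discards because it selects components from the predictors alone.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Conventional PCR: regress y on the top-k principal components of X.

    Hypothetical helper for illustration; components are chosen using
    the predictors only, without looking at the response.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Sample covariance of the predictors; eigh returns eigenvalues in ascending order.
    S = Xc.T @ Xc / len(y)
    _, evecs = np.linalg.eigh(S)
    V = evecs[:, ::-1][:, :k]          # top-k directions by predictor variance
    Z = Xc @ V                         # component scores
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    return V @ gamma                   # coefficients in the original coordinates

rng = np.random.default_rng(0)
n, p = 2000, 5
# Assumed design: independent predictors with decreasing standard deviations.
scales = np.array([5.0, 4.0, 3.0, 2.0, 0.5])
X = rng.standard_normal((n, p)) * scales
# Assumed truth: only the lowest-variance predictor matters for the response.
beta_true = np.array([0.0, 0.0, 0.0, 0.0, 2.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_pcr = pcr_fit(X, y, k=1)
beta_ols, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)

err_pcr = np.linalg.norm(beta_pcr - beta_true)
err_ols = np.linalg.norm(beta_ols - beta_true)
print(f"PCR (k=1) error: {err_pcr:.3f}, OLS error: {err_ols:.3f}")
```

In this assumed setup the single retained component is nearly orthogonal to the relevant direction, so the conventional estimate misses almost all of the signal while ordinary least squares recovers it. A method that also uses the responses when selecting linear combinations, as the abstract describes, is intended to avoid this failure.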