Interpretable Sparse Proximate Factors for Large Dimensions

Journal of Business & Economic Statistics · 2021
Cited by 25
Renmin University of China AABS 4

Summary

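The article proposes sparse, easy-to-interpret proximate factors that, using only 5%–10% of the data, almost perfectly replicate the factors from principal component analysis, with average correlations of about 97.5%. The approach is suited to the analysis of financial and macroeconomic data.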

Abstract

This article proposes sparse and easy-to-interpret proximate factors to approximate statistical latent factors. Latent factors in a large-dimensional factor model can be estimated by principal component analysis (PCA), but are usually hard to interpret. We obtain proximate factors that are easier to interpret by shrinking the PCA factor weights and setting them to zero except for the largest absolute ones. We show that proximate factors constructed with only 5%–10% of the data are usually sufficient to almost perfectly replicate the population and PCA factors without actually assuming a sparse structure in the weights or loadings. Using extreme value theory we explain why sparse proximate factors can be substitutes for non-sparse PCA factors. We derive analytical asymptotic bounds for the correlation of appropriately rotated proximate factors with the population factors. These bounds provide guidance on how to construct the proximate factors. In simulations and empirical analyses of financial portfolio and macroeconomic data, we illustrate that sparse proximate factors are close substitutes for PCA factors with average correlations of around 97.5%, while being interpretable.
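
The construction described in the abstract (keep only the largest absolute PCA weights and set the rest to zero) can be illustrated with a minimal sketch. This is not the authors' exact estimator. The function name `proximate_factors`, the parameters `n_factors` and `n_keep`, the unit-norm rescaling of the sparse weights, and the simulated panel are all assumptions made for illustration.

```python
import numpy as np


def proximate_factors(X, n_factors, n_keep):
    """Hypothetical helper: hard-threshold PCA weights on a T x N panel X.

    Returns the PCA factors, the sparse proximate factors, and the sparse
    weight matrix (N x n_factors).
    """
    Xc = X - X.mean(axis=0)  # demean each cross-sectional unit
    # PCA weights are the leading right-singular vectors of the demeaned panel.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    W = vt[:n_factors].T  # N x K dense weights
    F_pca = Xc @ W  # T x K PCA factors

    # Keep the n_keep largest absolute weights per factor, zero out the rest.
    W_sparse = np.zeros_like(W)
    for k in range(n_factors):
        keep = np.argsort(np.abs(W[:, k]))[-n_keep:]
        W_sparse[keep, k] = W[keep, k]
    # Rescale each sparse weight vector to unit length (an assumption here).
    W_sparse /= np.linalg.norm(W_sparse, axis=0)

    F_prox = Xc @ W_sparse  # T x K proximate factors
    return F_pca, F_prox, W_sparse


# Simulated two-factor panel (illustrative data, not from the article).
rng = np.random.default_rng(0)
T, N, K = 200, 100, 2
F = rng.standard_normal((T, K))
L = rng.standard_normal((N, K))
X = F @ L.T + rng.standard_normal((T, N))

# n_keep=10 uses 10% of the N=100 cross-sectional units per factor.
F_pca, F_prox, W_sparse = proximate_factors(X, n_factors=K, n_keep=10)
corr = [np.corrcoef(F_pca[:, k], F_prox[:, k])[0, 1] for k in range(K)]
print(corr)
```

Each entry of `corr` compares one PCA factor with its sparse counterpart. The abstract measures closeness after an appropriate rotation of the proximate factors, so a factor-by-factor correlation like this one is only a rough check.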

Sparse proximate factors · Principal component analysis · Factor interpretation · High-dimensional factor models