CARE: Large Precision Matrix Estimation for Compositional Data

Journal of the American Statistical Association · 2024
Cited by 7
ABS 4

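Summary

To address the difficulty of inferring conditional dependence relationships among variables in compositional data, the paper proposes the composition adaptive regularized estimation (CARE) method. CARE estimates the basis precision matrix under sparsity assumptions and is shown to achieve minimax optimality in sufficiently high dimensions. It is suited to applications such as microbial network inference.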

Abstract

High-dimensional compositional data are prevalent in many applications. The simplex constraint poses intrinsic challenges to inferring the conditional dependence relationships among the components forming a composition, as encoded by a large precision matrix. We introduce a precise specification of the compositional precision matrix and relate it to its basis counterpart, which is shown to be asymptotically identifiable under suitable sparsity assumptions. By exploiting this connection, we propose a composition adaptive regularized estimation (CARE) method for estimating the sparse basis precision matrix. We derive rates of convergence for the estimator and provide theoretical guarantees on support recovery and data-driven parameter tuning. Our theory reveals an intriguing trade-off between identification and estimation, thereby highlighting the blessing of dimensionality in compositional data analysis. In particular, in sufficiently high dimensions, the CARE estimator achieves minimax optimality and performs as well as if the basis were observed. We further discuss how our framework can be extended to handle data containing zeros, including sampling zeros and structural zeros. The advantages of CARE over existing methods are illustrated by simulation studies and an application to inferring microbial ecological networks in the human gut.

High-dimensional statistics · Compositional data analysis · Precision matrix estimation · Microbial ecological networks