PALAR: Estimation of Absolute Abundance Effects in Regression with Relative Abundance Predictors
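For microbiome relative abundance data, we propose PALAR, a method that estimates absolute abundance effects through a new transformation and penalized regression, and that outperforms existing methods in colorectal cancer studies.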
High-dimensional compositional data are ubiquitous in omics research. Microbiome sequencing experiments measure the relative abundances (proportions) of microbial features, while the absolute abundances within the ecosystem remain unobserved. Most regression methods that use microbial relative abundances as predictors rely on log-ratio transformations and typically impose sparsity on the regression coefficients to manage high dimensionality. However, we show that the sparsity assumption often does not hold for these coefficients. To address this issue, we systematically investigate the relationship between log-ratio regression and absolute abundance regression. Motivated by this connection, we propose PALAR, a simple and efficient approach for estimating sparse absolute abundance effects using penalized regression with predictors derived from a new compositional transformation. We applied PALAR to four microbiome studies on colorectal cancer, demonstrating its advantages over existing methods in consistently identifying disease-relevant microbial species and in improving prediction accuracy and generalizability.
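As a rough illustration of why relative and absolute abundance regressions are linked, the sketch below checks two standard identities on simulated data. It is not the PALAR transformation, which the abstract does not specify. The simulated matrix `absolute`, the coefficient vector `beta`, and the helper `clr` are all hypothetical names chosen for this example, and the noise-free log-linear outcome is an assumption made purely for illustration.

```python
import numpy as np


def clr(m):
    """Centered log-ratio transform, applied row-wise to a positive matrix."""
    log_m = np.log(m)
    return log_m - log_m.mean(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n, p = 5, 8

# Hypothetical absolute abundances (unobserved in a real experiment).
absolute = rng.lognormal(mean=2.0, sigma=1.0, size=(n, p))
total = absolute.sum(axis=1, keepdims=True)

# Relative abundances: what sequencing actually measures.
relative = absolute / total

# Identity 1: log-ratios do not depend on the unknown per-sample total.
same_clr = np.allclose(clr(relative), clr(absolute))

# Identity 2: log absolute = log relative + log total, feature by feature.
gap_is_log_total = np.allclose(np.log(absolute) - np.log(relative), np.log(total))

# Assumed sparse absolute abundance effects (two nonzero coefficients).
beta = np.zeros(p)
beta[0], beta[1] = 1.0, 0.5

# Noise-free outcome driven by log absolute abundances (an assumption).
y = np.log(absolute) @ beta

# The same outcome written with relative abundances picks up a per-sample
# offset equal to sum(beta) * log(total), which is unobserved.
y_from_relative = np.log(relative) @ beta + beta.sum() * np.log(total).ravel()
offset_identity = np.allclose(y, y_from_relative)
```

Under these assumptions, the unobserved total enters the relative abundance form as a sample-specific term whenever the absolute abundance coefficients do not sum to zero, which is one reason coefficients on log-ratio predictors need not inherit the sparsity of the absolute abundance effects.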