Sparse Bayesian Group Factor Model for Feature Interactions in Multiple Count Tables Data

Journal of the American Statistical Association · 2025
Cited by 0
ABS 4

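Summary

For multi-domain microbiome count data, the paper proposes a sparse Bayesian group factor model that uses a Dirichlet-Horseshoe prior to induce joint sparsity, flexibly accommodates overdispersion and zero inflation, and estimates microbial interactions and covariate effects.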

Abstract

Group factor models have been developed to infer relationships between multiple co-occurring multivariate continuous responses. Motivated by complex count data from multi-domain microbiome studies using next-generation sequencing, we develop a sparse Bayesian group factor model (Sp-BGFM) for multiple count table data that captures the interaction between microorganisms in different domains. Sp-BGFM uses a rounded kernel mixture model using a Dirichlet process (DP) prior with log-normal mixture kernels for count vectors. A group factor model is used to model the covariance matrix of the mixing kernel that describes microorganism interaction. We construct a Dirichlet-Horseshoe (Dir-HS) shrinkage prior and use it as a joint prior for factor loading vectors. Joint sparsity induced by a Dir-HS prior greatly improves the performance in high-dimensional applications. We further model the effects of covariates on microbial abundances using regression. The semiparametric model flexibly accommodates large variability in observed counts and excess zero counts and provides a basis for robust estimation of the interaction and covariate effects. We evaluate Sp-BGFM using simulation studies and real data analysis, comparing it to popular alternatives. Our results highlight the necessity of joint sparsity induced by the Dir-HS prior, and the benefits of a flexible DP model for baseline abundances.
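
To make the structure described in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes that the rounding step takes the floor of a log-normal latent variable, replaces the DP mixture with a single Gaussian baseline, and uses a random zero mask on the loadings as a simple stand-in for the Dir-HS prior. The function name `simulate_count_tables` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate_count_tables(n=200, dims=(5, 4), n_factors=2, sparsity=0.6, seed=0):
    """Simulate count tables that share latent factors (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = sum(dims)  # total number of features across all tables

    # Sparse loadings: a random zero mask stands in for the Dir-HS prior (assumption).
    mask = rng.random((p, n_factors)) > sparsity
    loadings = 0.7 * mask * rng.standard_normal((p, n_factors))

    # Factor-model covariance of the latent log-scale abundances.
    noise_sd = 0.5
    cov = loadings @ loadings.T + noise_sd**2 * np.eye(p)

    # One Gaussian baseline per feature replaces the DP mixture (assumption).
    baseline = rng.normal(1.0, 1.0, size=p)
    factors = rng.standard_normal((n, n_factors))
    latent = baseline + factors @ loadings.T + noise_sd * rng.standard_normal((n, p))

    # Rounded log-normal kernel: floor of exp(latent), so latent < 0 yields a zero count.
    counts = np.floor(np.exp(latent)).astype(np.int64)

    # Split the columns back into one table per domain.
    tables = np.split(counts, np.cumsum(dims)[:-1], axis=1)
    return tables, cov


if __name__ == "__main__":
    tables, cov = simulate_count_tables()
    print([t.shape for t in tables])
    # Off-diagonal block: implied covariance between features of the two tables.
    print(cov[:5, 5:].round(2))
```

In this sketch the off-diagonal block `cov[:5, 5:]` plays the role of the cross-domain interaction that the model targets, and zeroed loadings make many of its entries exactly zero.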

Bayesian statistics · Factor analysis · Microbiomics · High-dimensional data analysis · Count data modeling