Sparse tree-based clustering of microbiome data to characterize microbiome heterogeneity in pancreatic cancer

Journal of the Royal Statistical Society. Series C: Applied Statistics · 2023
Cited by: 6
ABS 3

Summary

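The paper proposes an unsupervised clustering method in the Bayesian framework that performs feature selection, determines the number of clusters automatically, and integrates tree-structure information relating the features. The method identifies subgroups of pancreatic cancer patients with similar microbiome profiles, which helps in understanding how the microbiome affects treatment outcomes.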

Abstract

There is a keen interest in characterizing variation in the microbiome across cancer patients, given increasing evidence of its important role in determining treatment outcomes. Here our goal is to discover subgroups of patients with similar microbiome profiles. We propose a novel unsupervised clustering approach in the Bayesian framework that innovates over existing model-based clustering approaches, such as the Dirichlet multinomial mixture model, in three key respects: we incorporate feature selection, learn the appropriate number of clusters from the data, and integrate information on the tree structure relating the observed features. We compare the performance of our proposed method to existing methods on simulated data designed to mimic real microbiome data. We then illustrate results obtained for our motivating data set, a clinical study aimed at characterizing the tumor microbiome of pancreatic cancer patients.
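
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of how a Dirichlet multinomial mixture scores one sample of taxon counts. It is not the authors' proposed method: it has no feature selection, no tree structure, and a fixed number of clusters. The function names, the mixing weights, and the concentration parameters are all hypothetical values chosen for illustration, not fitted to any data.

```python
from math import lgamma, exp, log

def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial
    distribution with concentration parameters alpha."""
    n = sum(counts)
    a = sum(alpha)
    out = lgamma(n + 1) + lgamma(a) - lgamma(n + a)
    for x, al in zip(counts, alpha):
        out += lgamma(x + al) - lgamma(al) - lgamma(x + 1)
    return out

def responsibilities(counts, weights, alphas):
    """Posterior probability that a sample belongs to each mixture
    component, given fixed (assumed) weights and concentrations."""
    logs = [log(w) + dm_log_pmf(counts, al) for w, al in zip(weights, alphas)]
    m = max(logs)  # subtract the maximum for numerical stability
    unnorm = [exp(v - m) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Illustrative inputs only: three taxa, two clusters.
sample = [9, 1, 0]
weights = [0.5, 0.5]
alphas = [[10.0, 1.0, 1.0], [1.0, 1.0, 10.0]]
resp = responsibilities(sample, weights, alphas)
print(resp)
```

In a full mixture model these responsibilities would be recomputed while the weights and concentration parameters are estimated; here they are held fixed to keep the example short.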

Keywords: microbiome; cluster analysis; pancreatic cancer; Bayesian methods; bioinformatics