Statistical Inference of Cell-Type Proportions Estimated from Bulk Expression Data

Journal of the American Statistical Association · 2024
Cited by 3
ABS 4

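Summary

The paper proposes DECALS, a flexible statistical deconvolution framework that estimates the cell-type proportions in mixed (bulk) samples together with their uncertainty. This helps identify differentially expressed genes between groups more accurately, and it is applicable to studies of conditions such as Alzheimer's disease.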

Abstract

There is a growing interest in cell-type-specific analysis from bulk samples with a mixture of different cell types. A critical first step in such analyses is the accurate estimation of cell-type proportions in a bulk sample. Although many methods have been proposed recently, quantifying the uncertainties associated with the estimated cell-type proportions has not been well studied. Lack of consideration of these uncertainties can lead to missed or false findings in downstream analyses. In this article, we introduce a flexible statistical deconvolution framework that allows a general and subject-specific covariance of bulk gene expressions. Under this framework, we propose a decorrelated constrained least squares method called DECALS that estimates cell-type proportions as well as the sampling distribution of the estimates. Simulation studies demonstrate that DECALS can accurately quantify the uncertainties in the estimated proportions whereas other methods fail. Applying DECALS to analyze bulk gene expression data of post mortem brain samples from the ROSMAP and GTEx projects, we show that taking into account the uncertainties in the estimated cell-type proportions can lead to more accurate identifications of cell-type-specific differentially expressed genes and transcripts between different subject groups, such as between Alzheimer's disease patients and controls and between males and females.
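
To make the deconvolution problem concrete, the following is a minimal sketch of plain constrained least squares, not the DECALS method itself: it omits the decorrelation step and does not produce a sampling distribution for the estimates. It assumes a known signature matrix (genes by cell types) and finds proportions that are nonnegative and sum to one by projected gradient descent. The function names, the simulated signature matrix, and the noise-free bulk sample are all illustrative assumptions.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def estimate_proportions(bulk, signatures, n_iter=2000):
    """Hypothetical helper: minimize 0.5 * ||bulk - signatures @ p||^2 over the simplex."""
    n_types = signatures.shape[1]
    p = np.full(n_types, 1.0 / n_types)       # start from equal proportions
    # Lipschitz constant of the gradient: squared largest singular value.
    lipschitz = np.linalg.norm(signatures, 2) ** 2
    for _ in range(n_iter):
        grad = signatures.T @ (signatures @ p - bulk)
        p = project_to_simplex(p - grad / lipschitz)
    return p

# Simulated example (assumed data): 50 genes, 3 cell types, no noise.
rng = np.random.default_rng(0)
signatures = rng.uniform(0.0, 10.0, size=(50, 3))
p_true = np.array([0.6, 0.3, 0.1])
bulk = signatures @ p_true

p_hat = estimate_proportions(bulk, signatures)
print(np.round(p_hat, 4))
```

A point estimate like `p_hat` carries no measure of uncertainty, which is the gap the abstract says DECALS is designed to address.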

Statistical inference · Cell-type proportion estimation · Gene expression data · Bioinformatics