Uncovering alterations in cancer epigenetics via trans-dimensional Markov chain Monte Carlo and hidden Markov models

Journal of the Royal Statistical Society. Series C: Applied Statistics · 2025

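Summary

The paper proposes DMCTHM, a new method that combines trans-dimensional Markov chain Monte Carlo with hidden Markov models to identify differentially methylated sites in cancer samples from bisulfite sequencing data. Applied to colorectal cancer, it finds new sites and genes associated with the Tp53 pathway.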

Abstract

Epigenetic alterations are key drivers in the development and progression of cancer. Identifying differentially methylated cytosines (DMCs) in cancer samples is a crucial step toward understanding these changes. In this paper, we propose DMCTHM, a trans-dimensional Markov chain Monte Carlo (TMCMC) approach that applies hidden Markov models (HMMs) with binomial emissions to bisulfite sequencing (BS-Seq) data to identify DMCs in cancer epigenetic studies. We introduce the Expander-Collider penalty to tackle under- and over-estimation in TMCMC-HMMs. We address all known challenges inherent in BS-Seq data by introducing novel approaches for capturing the functional patterns and autocorrelation structure of the data, as well as for handling missing values, multiple covariates, multiple comparisons, and family-wise errors. We demonstrate the effectiveness of DMCTHM through comprehensive simulation studies. The results show that our proposed method outperforms competing methods in identifying DMCs. Notably, with DMCTHM, we uncovered new DMCs and genes in colorectal cancer that were significantly enriched in the Tp53 pathway.
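To make the phrase "HMMs with binomial emission" concrete, the following is a minimal sketch of how the likelihood of methylated read counts can be evaluated under an HMM whose hidden states each carry a methylation probability. It is not the DMCTHM implementation: the number of states is fixed here, whereas the paper's TMCMC sampler moves across model dimensions, and the Expander-Collider penalty is not modelled. All names (`log_binom_pmf`, `forward_loglik`) and the toy counts are hypothetical and chosen only for illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log probability of k methylated reads out of n, with 0 < p < 1."""
    log_coef = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coef + k * math.log(p) + (n - k) * math.log(1.0 - p)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_loglik(meth, cov, init, trans, probs):
    """Forward-algorithm log-likelihood for a fixed-order binomial HMM.

    meth[i]  : methylated read count at site i (illustrative input)
    cov[i]   : total read coverage at site i
    init     : initial state probabilities (all assumed > 0)
    trans    : row-stochastic transition matrix (all entries assumed > 0)
    probs[s] : methylation probability of hidden state s
    """
    n_states = len(probs)
    # Initialise with the first site.
    alpha = [
        math.log(init[s]) + log_binom_pmf(meth[0], cov[0], probs[s])
        for s in range(n_states)
    ]
    # Recurse over the remaining sites in log space.
    for k, n in zip(meth[1:], cov[1:]):
        alpha = [
            logsumexp([alpha[r] + math.log(trans[r][s]) for r in range(n_states)])
            + log_binom_pmf(k, n, probs[s])
            for s in range(n_states)
        ]
    return logsumexp(alpha)


# Toy example: six CpG sites switching from low to high methylation.
meth = [1, 0, 2, 9, 8, 10]
cov = [10, 8, 12, 10, 9, 11]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
probs = [0.1, 0.9]
loglik = forward_loglik(meth, cov, init, trans, probs)
```

In a sampler of the kind the abstract describes, a quantity like `loglik` would be one ingredient of the acceptance ratio when proposing new parameters or a different number of states; how DMCTHM actually does this is specified in the paper, not here.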

Keywords: cancer epigenetics; hidden Markov models; identification of differentially methylated sites; bioinformatics methods