Uncovering alterations in cancer epigenetics via trans-dimensional Markov chain Monte Carlo and hidden Markov models
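Proposes a new method, DMCTHM, that combines trans-dimensional Markov chain Monte Carlo with hidden Markov models to identify differentially methylated sites in cancer samples from bisulfite sequencing data; in colorectal cancer, it uncovered new sites and genes associated with the Tp53 pathway.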
Abstract Epigenetic alterations are key drivers in the development and progression of cancer. Identifying differentially methylated cytosines (DMCs) in cancer samples is a crucial step toward understanding these changes. In this paper, we propose DMCTHM, a trans-dimensional Markov chain Monte Carlo (TMCMC) approach that uses hidden Markov models (HMMs) with binomial emissions to identify DMCs from bisulfite sequencing (BS-Seq) data in cancer epigenetic studies. We introduce the Expander-Collider penalty to tackle under- and over-estimation in TMCMC-HMMs. We address all known challenges inherent in BS-Seq data by introducing novel approaches for capturing the functional patterns and autocorrelation structure of the data, as well as for handling missing values, multiple covariates, multiple comparisons, and family-wise errors. We demonstrate the effectiveness of DMCTHM through comprehensive simulation studies. The results show that our proposed method outperforms competing methods in identifying DMCs. Notably, with DMCTHM, we uncovered new DMCs and genes in colorectal cancer that were significantly enriched in the Tp53 pathway.
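
To make the phrase "HMMs with binomial emission" concrete, the following is a minimal sketch of the forward algorithm for an HMM whose hidden states each emit methylated read counts from a binomial distribution. It is not the DMCTHM implementation: the number of states is fixed rather than sampled by TMCMC, there is no Expander-Collider penalty, and no covariates are modeled. The function names (`log_binom_pmf`, `forward_loglik`), the parameter layout, and the treatment of zero-coverage sites as missing are assumptions made for illustration only.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) pmf at k; requires 0 < p < 1."""
    log_coef = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coef + k * math.log(p) + (n - k) * math.log(1 - p)


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def forward_loglik(meth, cov, init, trans, probs):
    """Log-likelihood of a read-count sequence under a binomial-emission HMM.

    meth[t]  : methylated reads at cytosine t
    cov[t]   : total reads at cytosine t (0 is treated as a missing site)
    init[s]  : initial probability of hidden state s
    trans[r][s] : transition probability from state r to state s
    probs[s] : methylation probability of state s (hypothetical parameters)
    """
    n_states = len(init)

    def emit(t, s):
        # Assumption: a site with no coverage carries no information,
        # so its emission term is log(1) = 0 for every state.
        if cov[t] == 0:
            return 0.0
        return log_binom_pmf(meth[t], cov[t], probs[s])

    # Initialisation: alpha_0(s) = init(s) * emission(s)
    alpha = [math.log(init[s]) + emit(0, s) for s in range(n_states)]

    # Recursion: alpha_t(s) = sum_r alpha_{t-1}(r) * trans(r, s) * emission(s)
    for t in range(1, len(meth)):
        alpha = [
            logsumexp([alpha[r] + math.log(trans[r][s]) for r in range(n_states)])
            + emit(t, s)
            for s in range(n_states)
        ]

    # Termination: sum over the final hidden state
    return logsumexp(alpha)


# Illustrative two-state model (low vs. high methylation); values are made up.
example_loglik = forward_loglik(
    meth=[1, 0, 9, 8],
    cov=[10, 0, 10, 9],
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    probs=[0.1, 0.9],
)
```

Working in log space keeps the recursion stable for long chromosomal stretches, where products of many small probabilities would otherwise underflow. The dependence of each site on the hidden state of its neighbour is what lets an HMM of this kind capture the autocorrelation between nearby cytosines.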