Incorporating external risk information with the Cox model under population heterogeneity: applications to trans-ancestry polygenic hazard scores
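This article proposes a Kullback–Leibler-divergence-based Cox model (CoxKL) that integrates external European-ancestry polygenic risk scores with individual-level data from an internal non-European cohort, improving risk discrimination while accounting for population heterogeneity, with applications to trans-ancestry risk prediction for prostate cancer and breast cancer.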
Abstract Polygenic hazard scores (PHS) designed for individuals of European ancestry (EUR) carry substantial information for risk discrimination. Incorporating this information can improve risk discrimination in a small target non-EUR cohort. However, because the external EUR-based PHS and the internal individual-level data come from different populations, ignoring population heterogeneity can introduce substantial bias. In this article, we develop a Kullback–Leibler-based Cox model (CoxKL) that integrates internal individual-level time-to-event data with external risk scores derived from published prediction models. Partial-likelihood-based KL information is used to measure the discrepancy between the external risk information and the internal data. We establish the asymptotic properties of the CoxKL estimator. Simulation studies show that the integrated model fitted by the proposed CoxKL method achieves improved estimation efficiency and prediction accuracy. We apply the proposed method to develop trans-ancestry PHS models for prostate cancer and breast cancer, and find that integrating EUR-based PHS with internal genotype data from individuals of African ancestry yields considerable improvement in cancer risk discrimination.
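To make the idea of a partial-likelihood-based KL discrepancy concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, at each observed event time, both the internal Cox model and the external risk score induce a probability distribution over the subjects still at risk, and that the discrepancy is the KL divergence from the external distribution to the internal one, added to the negative log partial likelihood with a tuning weight. The function names, the argument `eta`, the direction of the KL term, and the absence of tied event times are all assumptions made for illustration.

```python
import math


def _log_softmax(values):
    """Log of the softmax probabilities, computed stably."""
    m = max(values)
    lse = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - lse for v in values]


def kl_cox_objective(beta, Z, time, event, ext_score, eta):
    """Negative log partial likelihood plus eta times a KL penalty.

    Hypothetical illustration only: the exact CoxKL objective is not
    given in the abstract. Assumes no tied event times.
    beta      : list of regression coefficients
    Z         : list of covariate rows (one list per subject)
    time      : observed follow-up times
    event     : 1 if the event was observed, 0 if censored
    ext_score : external risk score (log relative hazard) per subject
    eta       : non-negative weight on the external information
    """
    lin = [sum(b * z for b, z in zip(beta, row)) for row in Z]
    neg_loglik = 0.0
    kl_total = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        risk_set = [j for j, t_j in enumerate(time) if t_j >= t_i]
        # Internal model: probability that each at-risk subject fails at t_i.
        log_p = _log_softmax([lin[j] for j in risk_set])
        # External score: the same probabilities implied by the published model.
        log_q = _log_softmax([ext_score[j] for j in risk_set])
        neg_loglik -= log_p[risk_set.index(i)]
        # KL(q || p) over the risk set; zero when the two models agree.
        kl_total += sum(math.exp(lq) * (lq - lp) for lq, lp in zip(log_q, log_p))
    return neg_loglik + eta * kl_total
```

Under these assumptions, setting `eta = 0` recovers the ordinary internal-only Cox fit, while larger values of `eta` pull the internal estimate toward the ranking implied by the external score. Choosing `eta` from the data (for example by cross-validation) is one way such a weight could adapt to the degree of population heterogeneity.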