Subtype-Aware Registration of Longitudinal Electronic Health Records
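To address the misalignment between the time a disease is recorded in electronic health records and its actual onset time, this work proposes a subtype-aware timeline alignment method that uses data projection and discrete optimization to correct the misalignment, improving the accuracy of disease subtype identification and clinical prediction.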
Electronic Health Records (EHRs) contain extensive patient information that can inform downstream clinical tasks such as mortality prediction, disease phenotyping, and disease onset prediction. A key challenge in EHR data analysis is the temporal gap between the time a condition is first recorded and its actual onset. This timeline misalignment can make biomarker trends look artificially distinct among patients with similar disease progression, which undermines the reliability of downstream analyses and complicates tasks such as disease subtyping and outcome prediction. To address this challenge, we propose a subtype-aware timeline registration method that leverages data projection and discrete optimization to correct timeline misalignment. Through simulation and real-world data analyses, we demonstrate that the proposed method aligns distorted observed records with the true disease progression patterns, yielding clearer subtypes and improved performance in downstream clinical analyses. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
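To make the misalignment problem concrete, the following toy sketch shows how a search over discrete time shifts can bring two records of the same trajectory back onto a common timeline. It is an illustration only and not the method proposed in this article: the sigmoid `reference_curve`, the helper `best_shift`, the noise-free measurements, and the assumption of a single, known progression pattern are all hypothetical simplifications introduced here.

```python
import math

def reference_curve(t):
    """Hypothetical biomarker trajectory as a function of time since true onset."""
    return 1.0 / (1.0 + math.exp(-(t - 5.0)))

def best_shift(observed, candidate_shifts):
    """Return the candidate shift with the smallest mean squared error.

    observed: list of (recorded_time, value) pairs, where recorded_time is
    measured from the first record rather than from true onset.
    """
    def mse(shift):
        errors = [(v - reference_curve(t + shift)) ** 2 for t, v in observed]
        return sum(errors) / len(errors)
    return min(candidate_shifts, key=mse)

# Two simulated patients follow the same trajectory, but their first records
# fall 2 and 6 time units after true onset (assumed values for illustration).
patient_a = [(t, reference_curve(t + 2)) for t in range(6)]
patient_b = [(t, reference_curve(t + 6)) for t in range(6)]

shifts = range(0, 11)  # discrete search space of candidate offsets
shift_a = best_shift(patient_a, shifts)
shift_b = best_shift(patient_b, shifts)
print(shift_a, shift_b)
```

On the recorded timeline the two patients appear to have different biomarker trends. Once each record is moved by its estimated offset, both lie on the same curve. In real data the progression pattern is unknown, may differ across subtypes, and is observed with noise, so a single fixed reference like this one would not suffice.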