Comparing risk adjustment estimation methods under data availability constraints
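Compares the predictive performance of machine learning and traditional regression models across scenarios that differ in data granularity and range of variables. Machine learning shows a greater advantage when data are coarse-grained but the range of variables is rich, although the overall improvement is small and not statistically significant.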
Abstract The Italian National Healthcare Service relies on per capita allocation for healthcare funds, even though a wide range of highly detailed data is available that could support a more complex risk‐adjustment formula. However, heterogeneity in data availability limits the development of a national model. This paper implements and evaluates machine learning (ML) and standard risk‐adjustment models across the different data scenarios that a Region or Country may face, in order to identify the most predictive model for the information available. We show that, compared with linear regression, ML achieves a small but generally statistically insignificant improvement in adjusted R² and mean squared error with fine data granularity, while no differences were observed in the scenario with coarse granularity and a poor range of variables. The advantage of ML algorithms is greater in the scenarios with coarse granularity and a fair or rich range of variables, and limited in the fine granularity scenarios. The inclusion of detailed morbidity‐ and pharmacy‐based adjustors generally increases fit, although the trade‐off of creating adverse economic incentives must be considered.
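The two fit measures named in the abstract, adjusted R² and mean squared error, can be illustrated with a minimal sketch. The function names and the toy numbers below are hypothetical and are not taken from the paper; the adjusted R² uses the standard textbook definition, 1 − (1 − R²)(n − 1)/(n − p − 1), which may differ in detail from the estimator the authors applied.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def adjusted_r2(y_true: Sequence[float], y_pred: Sequence[float],
                n_predictors: int) -> float:
    """Adjusted R^2: R^2 penalised for the number of predictors in the model."""
    n = len(y_true)
    if len(y_pred) != n:
        raise ValueError("inputs must be of equal length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical observed and predicted per capita costs (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 6.0]

print(mean_squared_error(observed, predicted))
print(adjusted_r2(observed, predicted, n_predictors=1))
```

Because adjusted R² penalises each additional predictor, it gives a fairer comparison than plain R² between a model built on a poor range of variables and one built on a rich range.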