Prediction with Differential Covariate Classification: Illustrated by Covariate Classification in Medical Risk Assessment

Journal of Business & Economic Statistics · 2026

Cited by 0 · Top 2% among articles in the same journal and year

Renmin University of China journal list · AABS 4

Atheendar S. Venkataramani · University of Pennsylvania (corresponding author)
Charles F. Manski · Northwestern University (USA)
John Mullahy · University of Wisconsin-Madison

Summary

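The paper studies how to identify and predict the conditional probability P(y|x) when covariates are classified differently in the research evidence than in the decision setting. It finds that the bounds on P(y|x) can be wide and that the information needed to narrow them is available only in special cases, which is a caution for medical and policy decisions.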

Abstract

A common practice in evidence-based decision-making uses estimates of conditional probabilities P(y|x) obtained from research studies to predict outcomes y based on observed covariates x. Given this information, decisions are then based on the predicted outcomes. Researchers commonly assume that the predictors used in the generation of the evidence are the same as those used in applying the evidence: i.e., the meaning of x in the two circumstances is the same. This may not be the case in real-world settings. Across a wide range of settings, ranging from clinical practice to education policy, demographic attributes (e.g., age, race, ethnicity) are often classified differently in research studies than in decision settings. This paper studies identification in such settings. We propose a formal framework for prediction with what we term differential covariate classification (DCC). Using this framework, we analyze partial identification of probabilistic predictions and assess how various assumptions influence the identification regions. We apply the findings to a range of settings, focusing mainly on differential classification of individuals' race and ethnicity in clinical medicine. We find that bounds on P(y|x) can be wide, and the information needed to narrow them is available only in special cases. These findings highlight an important problem in using evidence in decision making, a problem that has not yet been fully appreciated in debates on classification in public policy and medicine.
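
To make the idea of a wide bound concrete, here is a minimal sketch of one textbook case. It is not taken from the paper. It assumes that a decision-setting category x pools several research-study categories w, that the risk within each research category carries over unchanged (P(y|w, x) = P(y|w)), and that the shares P(w|x) are unknown. The function names and the numbers are hypothetical. Under those assumptions the law of total probability gives P(y|x) = Σ_w P(y|w) P(w|x), so without share information P(y|x) can lie anywhere between the smallest and largest P(y|w).

```python
def no_information_bounds(p_y_given_w):
    """Bounds on P(y|x) when the shares P(w|x) are completely unknown.

    Assumption (illustrative only): P(y|w, x) = P(y|w) for every research
    category w pooled into the decision category x.
    """
    values = list(p_y_given_w.values())
    return min(values), max(values)


def point_prediction(p_y_given_w, shares):
    """P(y|x) = sum_w P(y|w) * P(w|x) once the shares P(w|x) are known."""
    if set(shares) != set(p_y_given_w):
        raise ValueError("shares must cover the same categories as the risks")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(p_y_given_w[w] * shares[w] for w in p_y_given_w)


# Hypothetical research-study risks for two categories, "A" and "B".
research_risk = {"A": 0.10, "B": 0.30}

lower, upper = no_information_bounds(research_risk)
print(lower, upper)

# Knowing the shares collapses the interval to a single value.
print(point_prediction(research_risk, {"A": 0.25, "B": 0.75}))
```

In this sketch the interval is as wide as the spread of risks across the pooled research categories, and it collapses to a point only when the shares are known.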

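Keywords: differential covariate classification · partial identification · probabilistic prediction · medical risk assessment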
