Overlap in observational studies with high-dimensional covariates

Journal of Econometrics · 2020
Cited by 107 · Top 6% among articles in the same journal and year
Renmin University of China journal list: A · ABS: 4

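Reading guide

Examines how stringent the overlap assumption is in observational studies with high-dimensional covariates. Using results from information theory, the paper derives the bounds that strict overlap places on imbalance in covariate means, and shows that the assumption becomes more restrictive as the dimension grows, with important implications for common approaches such as sparsity and trimming.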

Abstract

Estimating causal effects under exogeneity hinges on two key assumptions: unconfoundedness and overlap. Researchers often argue that unconfoundedness is more plausible when more covariates are included in the analysis. Less discussed is the fact that covariate overlap is more difficult to satisfy in this setting. In this paper, we explore the implications of overlap in observational studies with high-dimensional covariates and formalize a curse-of-dimensionality argument, suggesting that these assumptions are stronger than investigators likely realize. Our key innovation is to explore how strict overlap restricts global discrepancies between the covariate distributions in the treated and control populations. Exploiting results from information theory, we derive explicit bounds on the average imbalance in covariate means under strict overlap and show that these bounds become more restrictive as the dimension grows large. We discuss how these implications interact with assumptions and procedures commonly deployed in observational causal inference, including sparsity and trimming.
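
The curse-of-dimensionality argument can be illustrated with a toy simulation. The sketch below is not taken from the paper; it assumes a hypothetical design in which P(T=1) = 0.5 and the covariates are independent unit-variance Gaussians whose means differ by `delta` in every coordinate between treated and control units. The function name `share_within_strict_overlap` and the parameters `delta`, `eta`, `n`, and `seed` are illustrative choices. Under this design the true propensity score has a closed form, so one can estimate the share of units whose score lies in [eta, 1 - eta] as the dimension `p` grows.

```python
import math
import random


def share_within_strict_overlap(p, delta=0.5, eta=0.1, n=20000, seed=0):
    """Monte Carlo share of units whose true propensity score is in [eta, 1 - eta].

    Hypothetical design (an assumption, not the paper's setup):
    P(T=1) = 0.5 and X | T=t ~ N(t * delta * 1_p, I_p).
    """
    rng = random.Random(seed)
    # eta <= e(x) <= 1 - eta is equivalent to |logit e(x)| <= log((1 - eta) / eta).
    bound = math.log((1 - eta) / eta)
    inside = 0
    for _ in range(n):
        treated = rng.random() < 0.5
        # The log-odds depend on x only through sum(x), which is
        # N(p * delta, p) for treated units and N(0, p) for controls,
        # so we draw the sum directly instead of all p coordinates.
        s = rng.gauss(p * delta if treated else 0.0, math.sqrt(p))
        # Gaussian log density ratio; with P(T=1) = 0.5 it equals logit e(x).
        logit = delta * s - p * delta ** 2 / 2
        inside += abs(logit) <= bound
    return inside / n


if __name__ == "__main__":
    for p in (1, 10, 100):
        print(p, share_within_strict_overlap(p))
```

With a fixed per-coordinate mean difference, the share of units satisfying the bound falls sharply as `p` increases. Read the other way, keeping the propensity score bounded in high dimensions requires the average mean imbalance across coordinates to shrink, which is the qualitative message of the abstract.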

Overlap assumption · High-dimensional covariates · Causal inference · Curse of dimensionality