Doubly robust and heteroscedasticity-aware sample trimming for causal inference

Biometrika · 2024

Cited by: 1

ABS 4

Samir Khan
Johan Ugander

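Overview

For heteroscedastic data analysed with nonparametric models, the paper proposes new sample trimming methods, including trimming units with extreme conditional variances, and gives theoretical guarantees for inference after trimming. The methods are validated in simulation and on real data.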

Abstract

A popular method for variance reduction in causal inference is propensity-based trimming, the practice of removing units with extreme propensities from the sample. This practice has theoretical grounding when the data are homoscedastic and the propensity model is parametric (Crump et al., 2009; Yang & Ding, 2018), but in modern settings where heteroscedastic data are analysed with nonparametric models, existing theory fails to support current practice. In this work, we address this challenge by developing new methods and theory for sample trimming. Our contributions are three-fold. First, we describe novel procedures for selecting which units to trim. Our procedures differ from previous works in that we trim, not only units with small propensities, but also units with extreme conditional variances. Second, we give new theoretical guarantees for inference after trimming. In particular, we show how to perform inference on the trimmed subpopulation without requiring that our regressions converge at parametric rates. Instead, we make only fourth-root rate assumptions like those in the double machine learning literature. This result applies to conventional propensity-based trimming as well, and thus may be of independent interest. Finally, we propose a bootstrap-based method for constructing simultaneously valid confidence intervals for multiple trimmed subpopulations, which are valuable for navigating the trade-off between sample size and variance reduction inherent in trimming. We validate our methods in simulation, on the 2007–2008 National Health and Nutrition Examination Survey and on a semisynthetic Medicare dataset, and find promising results in all settings.
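To make the contrast between propensity-based trimming and variance-aware trimming concrete, the following is a minimal sketch on simulated data. It is not the paper's procedure: the abstract does not specify the trimming rule, so the score `sigma2_1 / e + sigma2_0 / (1 - e)` used here is an assumed stand-in (the per-unit term of the standard semiparametric variance bound). The function names, the thresholds `alpha` and `gamma`, and the use of oracle (true) propensities, outcome means, and conditional variances in place of fitted regressions are all illustrative assumptions.

```python
import numpy as np

def trim_by_propensity(e, alpha=0.1):
    """Conventional trimming: keep units with propensity in [alpha, 1 - alpha]."""
    e = np.asarray(e, dtype=float)
    return (e >= alpha) & (e <= 1.0 - alpha)

def variance_score(e, sigma2_1, sigma2_0):
    """Assumed per-unit score: sigma_1^2(x) / e(x) + sigma_0^2(x) / (1 - e(x))."""
    e = np.asarray(e, dtype=float)
    return np.asarray(sigma2_1) / e + np.asarray(sigma2_0) / (1.0 - e)

def trim_by_variance_score(e, sigma2_1, sigma2_0, gamma):
    """Hypothetical heteroscedasticity-aware rule: keep units with score <= gamma."""
    return variance_score(e, sigma2_1, sigma2_0) <= gamma

def aipw_on_subsample(y, t, e, mu1, mu0, keep):
    """AIPW estimate and standard error of the average effect among kept units."""
    psi = mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    psi = psi[keep]
    return psi.mean(), psi.std(ddof=1) / np.sqrt(psi.size)

# Simulated data with poor overlap and heteroscedastic treated outcomes.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-3.0, 3.0, size=n)
e = 1.0 / (1.0 + np.exp(-1.5 * x))      # true propensity score
t = rng.binomial(1, e)                  # treatment indicator
mu0, mu1 = x, x + 1.0                   # constant treatment effect of 1
sigma2_0 = np.ones(n)                   # control outcome variance
sigma2_1 = 1.0 + x**2                   # treated outcome variance grows with |x|
noise_sd = np.sqrt(np.where(t == 1, sigma2_1, sigma2_0))
y = np.where(t == 1, mu1, mu0) + noise_sd * rng.normal(size=n)

keep_all = np.ones(n, dtype=bool)
keep_prop = trim_by_propensity(e, alpha=0.1)
keep_var = trim_by_variance_score(e, sigma2_1, sigma2_0, gamma=20.0)

for name, keep in [("no trimming", keep_all),
                   ("propensity trimming", keep_prop),
                   ("variance-score trimming", keep_var)]:
    est, se = aipw_on_subsample(y, t, e, mu1, mu0, keep)
    print(f"{name}: kept={keep.sum()}, estimate={est:.3f}, se={se:.3f}")
```

Under homoscedasticity with equal arm variances, the assumed score reduces to `sigma2 / (e * (1 - e))`, so thresholding it is equivalent to symmetric propensity trimming; the two rules differ only when the conditional variances vary with the covariates. In practice the nuisance quantities would be estimated, which is where the convergence-rate assumptions discussed in the abstract become relevant.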

Topics: Causal inference · Econometrics · Statistics · Machine learning
