The Finite Sample Performance of Inference Methods for Propensity Score Matching and Weighting Estimators
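Using simulations based on German and U.S. data, this study compares the finite sample performance of asymptotic approximations and bootstrap inference for propensity score matching and weighting estimators, and finds that theoretically justified bootstrap procedures outperform asymptotic approximations in terms of coverage rates.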
This article investigates the finite sample properties of a range of inference methods for propensity score-based matching and weighting estimators that are frequently applied to evaluate the average treatment effect on the treated. We analyze both asymptotic approximations and bootstrap methods for computing variances and confidence intervals, using simulation designs based on German register data and U.S. survey data. We vary the designs with respect to treatment selectivity, effect heterogeneity, share of treated, and sample size. The results suggest that, in general, theoretically justified bootstrap procedures (i.e., wild bootstrapping for pair matching and standard bootstrapping for “smoother” treatment effect estimators) dominate the asymptotic approximations in terms of coverage rates for both matching and weighting estimators. Most findings are robust across simulation designs and estimators.
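To make the idea of standard bootstrapping for a "smoother" estimator concrete, the following is a minimal sketch for a normalized inverse probability weighting (IPW) estimator of the average treatment effect on the treated. It is an illustration under stated assumptions, not the article's implementation: the logit propensity score model, the percentile interval, the simulated data, and all function names (`fit_logit`, `ipw_att`, `bootstrap_ci`) are hypothetical choices made here. The propensity score is re-estimated in every bootstrap sample so that the resampling reflects its estimation error.

```python
import numpy as np

def fit_logit(X, d, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (d - p)                           # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X    # negative Hessian
        beta += np.linalg.solve(hess, grad)
    return beta

def ipw_att(y, d, X):
    """Normalized IPW estimate of the ATT with an estimated propensity score."""
    p = 1.0 / (1.0 + np.exp(-X @ fit_logit(X, d)))
    w = (1.0 - d) * p / (1.0 - p)                      # odds weights, nonzero for controls only
    return y[d == 1].mean() - np.sum(w * y) / np.sum(w)

def bootstrap_ci(y, d, X, n_boot=499, alpha=0.05, seed=0):
    """Standard (nonparametric) bootstrap: resample observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = ipw_att(y[idx], d[idx], X[idx])     # propensity score refit each time
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lo, hi, draws.std(ddof=1)

# Hypothetical simulated data (assumption): one confounder, true ATT equal to 1.
rng = np.random.default_rng(42)
n = 1000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
d = (rng.random(n) < 1.0 / (1.0 + np.exp(-(0.5 * x - 0.5)))).astype(float)
y = 1.0 * d + x + rng.normal(size=n)

att_hat = ipw_att(y, d, X)
lo, hi, se = bootstrap_ci(y, d, X)
print(f"ATT estimate: {att_hat:.3f}, bootstrap SE: {se:.3f}, 95% CI: [{lo:.3f}, {hi:.3f}]")
```

This sketch covers only the weighting case. For pair matching, the abstract points to wild bootstrapping instead, which this example does not implement.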