At What Level Should One Cluster Standard Errors in Paired and Small-Strata Experiments?
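The study finds that in paired experiments, t-tests over-reject the null hypothesis when standard errors are clustered at the unit-of-randomization level; it recommends clustering at the pair level instead, a conclusion that also applies to small-strata experiments.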
In matched-pairs experiments in which one cluster in each pair of clusters is assigned to treatment, researchers often estimate treatment effects by regressing their outcome on a treatment indicator and pair fixed effects, clustering standard errors at the unit-of-randomization level. We show that even if the treatment has no effect, a 5 percent–level t-test based on this regression will wrongly conclude that the treatment has an effect up to 16.5 percent of the time. To fix this problem, researchers should instead cluster standard errors at the pair level. Using simulations, we show that similar results apply to clustered experiments with small strata.
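The over-rejection described above can be illustrated with a small Monte Carlo sketch. This is not the paper's simulation design. It assumes one observation per randomization unit, normally distributed outcomes with a pair-level shock, no treatment effect, and cluster-robust variance estimators without any degrees-of-freedom correction. The function name `rejection_rates` and its parameters are hypothetical.

```python
import numpy as np


def rejection_rates(n_pairs=200, n_sims=2000, seed=0):
    """Share of simulations in which a 5 percent-level t-test rejects a true
    null of no effect, under unit-level vs pair-level clustering.

    Illustrative assumptions: one observation per randomization unit,
    normal outcomes, no degrees-of-freedom correction.
    """
    rng = np.random.default_rng(seed)

    # Outcomes: a shock shared within each pair plus unit-level noise.
    # The treatment has no effect, so it never enters the outcome.
    pair_shock = rng.normal(size=(n_sims, n_pairs, 1))
    y = pair_shock + rng.normal(size=(n_sims, n_pairs, 2))

    # Assign exactly one unit per pair to treatment, at random.
    first_treated = rng.integers(0, 2, size=(n_sims, n_pairs, 1))
    d = np.concatenate([first_treated, 1 - first_treated], axis=2).astype(float)

    # Demeaning within pairs is equivalent to including pair fixed effects.
    y_w = y - y.mean(axis=2, keepdims=True)
    d_w = d - d.mean(axis=2, keepdims=True)

    # OLS coefficient on the treatment indicator, one per simulation.
    denom = (d_w ** 2).sum(axis=(1, 2))
    beta = (d_w * y_w).sum(axis=(1, 2)) / denom

    # Sandwich variance: sum of squared cluster-level score totals.
    resid = y_w - beta[:, None, None] * d_w
    score = d_w * resid
    v_unit = (score ** 2).sum(axis=(1, 2)) / denom ** 2           # cluster = unit
    v_pair = (score.sum(axis=2) ** 2).sum(axis=1) / denom ** 2    # cluster = pair

    crit = 1.96  # two-sided 5 percent normal critical value
    reject_unit = np.abs(beta) / np.sqrt(v_unit) > crit
    reject_pair = np.abs(beta) / np.sqrt(v_pair) > crit
    return reject_unit.mean(), reject_pair.mean()


if __name__ == "__main__":
    unit_rate, pair_rate = rejection_rates()
    print(f"unit-level clustering: {unit_rate:.3f}")
    print(f"pair-level clustering: {pair_rate:.3f}")
```

In this simplified setup the unit-clustered variance estimate works out to half the pair-clustered one, so the unit-clustered rejection rate should land well above the nominal 5 percent while the pair-clustered rate should stay close to it. Applied work would normally use a regression package's cluster-robust options, which may apply small-sample corrections that this sketch omits.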