Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments
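Examines how model selection affects inference in factorial experiments. Short models that omit interaction terms raise statistical power but can lead to incorrect conclusions when interactions are nonzero. Of 27 factorial experiments published in top-five journals, 19 use the short model, and more than half of their results lose significance once interaction terms are included.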
Abstract: Factorial designs are widely used to study multiple treatments in one experiment. Although t-tests using a fully saturated “long” model provide valid inferences, “short” model t-tests (that ignore interactions) yield higher power if interactions are zero, but incorrect inferences otherwise. Of 27 factorial experiments published in top-five journals (2007–2017), nineteen use the short model. After including interactions, more than half of their results lose significance. Based on recent econometric advances, we show that power improvements over the long model are possible. We provide practical guidance for the design of new experiments and the analysis of completed experiments.
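The contrast between the long and short models can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a balanced 2x2 design with two binary treatments T1 and T2, an outcome y = b1*T1 + b2*T2 + b12*T1*T2 + noise, and hypothetical names (simulate_2x2, long_t1, short_t1). In a balanced design, the saturated long model estimates the effect of T1 as the difference between the (T1=1, T2=0) and (T1=0, T2=0) cell means, while the short-model coefficient on T1 equals the difference in means pooled over T2. Only point estimates are computed here; standard errors and t-tests are left out.

```python
import random
from statistics import mean


def simulate_2x2(b1, b2, b12, n_per_cell=500, noise_sd=1.0, seed=0):
    """Simulate a balanced 2x2 factorial and estimate the effect of T1.

    All parameter names are illustrative assumptions, not the paper's notation.
    """
    rng = random.Random(seed)
    cells = {}
    for t1 in (0, 1):
        for t2 in (0, 1):
            # Outcome with main effects, an interaction, and Gaussian noise.
            cells[(t1, t2)] = [
                b1 * t1 + b2 * t2 + b12 * t1 * t2 + rng.gauss(0.0, noise_sd)
                for _ in range(n_per_cell)
            ]

    # Long (saturated) model: effect of T1 when T2 is absent.
    long_t1 = mean(cells[(1, 0)]) - mean(cells[(0, 0)])

    # Short model in a balanced design: T1 effect pooled over both T2 arms.
    treated = cells[(1, 0)] + cells[(1, 1)]
    control = cells[(0, 0)] + cells[(0, 1)]
    short_t1 = mean(treated) - mean(control)

    return {"long_t1": long_t1, "short_t1": short_t1}


if __name__ == "__main__":
    # T1 has no effect on its own, but interacts with T2.
    print(simulate_2x2(b1=0.0, b2=0.0, b12=1.0))
```

Under these assumptions the short-model estimate targets b1 + b12/2 rather than b1, so a treatment with no effect of its own can appear effective when the interaction is nonzero. When b12 = 0 the two estimates share the same target, and the short model pools twice as many observations per arm, which is the source of its higher power.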