Stability of Experimental Results: Forecasts and Evidence
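This study examines how changes in experimental design affect the robustness of results. It finds that results are highly stable under pure replication and across demographics, that changes in the task and output measure introduce noise, and that expert forecasts underestimated stability.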
How robust are experimental results to changes in design? And can researchers anticipate which changes matter most? We consider a real-effort task with multiple behavioral treatments and examine the stability of the results along six dimensions: (i) pure replication, (ii) demographics, (iii) geography and culture, (iv) the task, (v) the output measure, and (vi) the presence of a consent form. We find near-perfect replication of the experimental results and full stability of the results across demographics, a degree of stability significantly higher than a group of experts expected. By contrast, the results differ when the task or the output measure changes, mostly because the task change adds noise to the findings.