Rethinking the Gold Standard With Multi-armed Bandits: Machine Learning Allocation Algorithms for Experiments
提出用贝叶斯多臂老虎机算法替代传统随机等量分配,通过蒙特卡洛模拟证明其在多数情境下更高效、更符合伦理,并为研究者提供建议。
In experiments, researchers commonly allocate subjects randomly and equally to the different treatment conditions before the experiment starts. While this approach is intuitive, it means that new information gathered during the experiment is not utilized until after the experiment has ended. Based on methodological approaches from other scientific disciplines such as computer science and medicine, we suggest machine learning algorithms for subject allocation in experiments. Specifically, we discuss a Bayesian multi-armed bandit algorithm for randomized controlled trials and use Monte Carlo simulations to compare its efficiency with randomized controlled trials that have a fixed and balanced subject allocation. Our findings indicate that a randomized allocation based on Bayesian multi-armed bandits is more efficient and ethical in most settings. We develop recommendations for researchers and discuss the limitations of our approach.