Optimal activation of halting multi‐armed bandit models
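This paper studies a new class of dynamic allocation problems, the halting bandit models, and gives new proofs of the classic Gittins index decomposition result; it is of reference value to the fields of operations research and machine learning.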
Abstract We study new types of dynamic allocation problems, the Halting Bandit models. As an application, we obtain new proofs for the classic Gittins index decomposition result (cf. Gittins, Journal of the Royal Statistical Society, Series B, 1979, 41, 148–177) and for recent results of the authors in Cowan and Katehakis (Probability in the Engineering and Informational Sciences, 2015, 29, 51–76).