Estimation of the Odds Ratio in the Two-Armed Bandit Problem

Biometrika · 1981

Cited by: 0

ABS 4

Lakhbir S. Hayre
Bruce W. Turnbull

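Overview

For the log odds ratio of two Bernoulli populations, the paper proposes asymptotically optimal sequential procedures for fixed-width interval estimation and for point estimation. Observation costs may differ between the two populations and may depend on the success probabilities. The interval estimation procedure is studied by simulation under two cost structures, and approximate expressions are given for the cost saved by adaptive sampling relative to pairwise sampling.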

Abstract

Asymptotically optimal sequential procedures are proposed for fixed width interval and for point estimation of the log odds ratio for two Bernoulli populations. The costs of observations can be different for the two populations and possibly dependent on the success probabilities. The interval estimation procedure is studied by simulation for two sampling cost structures of particular interest, namely when the goal is to minimize the total average sample size, and when the goal is to minimize the total expected number of failures before termination. Approximate expressions given for the savings in sampling cost using adaptive rather than pairwise sampling show that such savings can be substantial in some cases. In the final section, the multiarmed bandit problem is considered.
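To make the quantities in the abstract concrete, the following is a minimal sketch based only on standard large-sample theory, not on the paper's sequential procedures. It assumes the success probabilities p1 and p2 are known (the sequential procedures exist precisely because they are not), uses the usual asymptotic variance 1/(n1 p1 q1) + 1/(n2 p2 q2) of the estimated log odds ratio, and compares the cheapest allocation meeting a target interval half-width with equal (pairwise) allocation. The function name `oracle_allocation` and all parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def oracle_allocation(p1, p2, c1=1.0, c2=1.0, half_width=0.5, conf=0.95):
    """Known-p ("oracle") sample sizes for a fixed-width interval on the
    log odds ratio log[p1(1-p2) / (p2(1-p1))].

    Illustrative assumption: large-sample normal approximation with
    Var(estimate) ~ a1/n1 + a2/n2, where a_i = 1 / (p_i * (1 - p_i)).
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)   # two-sided normal quantile
    target_var = (half_width / z) ** 2           # variance needed for the width

    a1 = 1.0 / (p1 * (1.0 - p1))
    a2 = 1.0 / (p2 * (1.0 - p2))

    # Minimise c1*n1 + c2*n2 subject to a1/n1 + a2/n2 = target_var
    # (Lagrange multipliers give n_i proportional to sqrt(a_i / c_i)).
    root = math.sqrt(c1 * a1) + math.sqrt(c2 * a2)
    n1 = math.sqrt(a1 / c1) * root / target_var
    n2 = math.sqrt(a2 / c2) * root / target_var
    adaptive_cost = c1 * n1 + c2 * n2

    # Pairwise sampling: the same number of observations on each arm.
    n_pair = (a1 + a2) / target_var
    pairwise_cost = (c1 + c2) * n_pair

    return {
        "n1": n1,
        "n2": n2,
        "target_var": target_var,
        "adaptive_cost": adaptive_cost,
        "pairwise_cost": pairwise_cost,
        "saving_fraction": 1.0 - adaptive_cost / pairwise_cost,
    }


if __name__ == "__main__":
    # Cost structure 1: every observation costs 1 (total sample size).
    print(oracle_allocation(0.5, 0.9))
    # Cost structure 2: cost is the failure probability (expected failures).
    print(oracle_allocation(0.5, 0.9, c1=0.5, c2=0.1))
```

Setting c1 = c2 = 1 corresponds to minimizing total sample size, while setting each cost to the failure probability 1 - p_i corresponds to minimizing the expected number of failures. These are the two cost structures named in the abstract, here evaluated only in this idealized known-p setting.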

Keywords: statistics; sequential analysis; interval estimation; point estimation; adaptive sampling
