Strategic Experimentation with Exponential Bandits
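Analyzes a game of strategic experimentation with two-armed bandits in which the risky arm may yield payoffs after exponentially distributed random times. The study finds that free-riding leads to an inefficiently low level of experimentation, constructs symmetric and asymmetric Markovian equilibria, and examines the efficiency of information acquisition under different switching strategies.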
We analyze a game of strategic experimentation with two-armed bandits whose risky arm might yield payoffs after exponentially distributed random times. Free-riding causes an inefficiently low level of experimentation in any equilibrium where the players use stationary Markovian strategies with beliefs as the state variable. We construct the unique symmetric Markovian equilibrium of the game, followed by various asymmetric ones. There is no equilibrium where all players use simple cut-off strategies. Equilibria where players switch finitely often between experimenting and free-riding all yield a similar pattern of information acquisition, greater efficiency being achieved when the players share the burden of experimentation more equitably. When players switch roles infinitely often, they can acquire an approximately efficient amount of information, but still at an inefficient rate. In terms of aggregate payoffs, all these asymmetric equilibria dominate the symmetric one wherever the latter prescribes simultaneous use of both arms. Copyright The Econometric Society 2005.
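
To make the role of beliefs as the state variable concrete, the following is a minimal sketch of Bayesian updating in an exponential bandit, under assumptions the abstract does not spell out: a good risky arm pays off at a constant Poisson rate, a bad risky arm never pays, and all payoffs are publicly observed. The function name `posterior_belief` and its parameters are hypothetical, chosen only for illustration. While no payoff arrives, the common belief that the arm is good drifts downward, and it falls faster the more players experiment.

```python
import math


def posterior_belief(prior, rate, intensity, elapsed):
    """Belief that the risky arm is good after `elapsed` time with no payoff.

    Assumptions (not taken from the abstract):
      - a good arm produces its first payoff after an Exp(rate) time
        per unit of experimentation intensity;
      - a bad arm never produces a payoff;
      - `intensity` is the constant total number of players on the risky arm.
    """
    # Probability that a good arm stays silent for this long.
    silent_if_good = math.exp(-rate * intensity * elapsed)
    # Bayes' rule: a bad arm is silent with probability 1.
    return prior * silent_if_good / (prior * silent_if_good + (1.0 - prior))


if __name__ == "__main__":
    for players in (1, 2, 3):
        belief = posterior_belief(prior=0.5, rate=1.0, intensity=players, elapsed=1.0)
        print(f"{players} player(s) experimenting: belief = {belief:.4f}")
```

Because every player learns from every other player's experimentation in this sketch, each individual has an incentive to let the others bear the cost of using the risky arm, which is the free-riding effect the abstract describes.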