Negatively Correlated Bandits

Review of Economic Studies · 2011
Cited by 10
RUC A+ · FT50 · ABS 4*

Summary

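The paper studies a two-player game in which each player chooses, in continuous time, between a safe arm and a risky arm. When the quality of the two risky arms is perfectly negatively correlated, learning is complete once the stakes exceed a threshold, all equilibria are in cutoff strategies, and the character of the equilibrium depends on the size of the stakes.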

Abstract

We analyze a two-player game of strategic experimentation with two-armed bandits. Each player has to decide in continuous time whether to use a safe arm with a known payoff or a risky arm whose likelihood of delivering payoffs is initially unknown. The quality of the risky arms is perfectly negatively correlated between players. In marked contrast to the case where both risky arms are of the same type, we find that learning will be complete in any Markov perfect equilibrium if the stakes exceed a certain threshold, and that all equilibria are in cutoff strategies. For low stakes, the equilibrium is unique, symmetric, and coincides with the planner's solution. For high stakes, the equilibrium is unique, symmetric, and tantamount to myopic behavior. For intermediate stakes, there is a continuum of equilibria.

negative correlation · risky arm · strategic experimentation · two-armed bandits · Markov perfect equilibrium