Strategic Experimentation
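Extends the classic two-armed bandit problem to a multi-player setting: N players face the same experimentation problem, information is a public good, and the resulting free-rider and encouragement effects are used to analyze the set of stationary Markov equilibria.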
This paper extends the classic two-armed bandit problem to a many-agent setting in which N players each face the same experimentation problem. The main change from the single-agent problem is that an agent can now learn from the current experimentation of other agents. Information is therefore a public good, and a free-rider problem in experimentation naturally arises. More interestingly, the prospect of future experimentation by others encourages agents to increase current experimentation, in order to bring forward the time at which the extra information generated by such experimentation becomes available. The paper provides an analysis of the set of stationary Markov equilibria in terms of the free-rider effect and the encouragement effect.