Two Competing Models of How People Learn in Games
比较强化学习与随机虚拟博弈两种人类学习模型,发现它们比预想的更相似,预期运动都可写成进化复制者动力学的扰动形式,主要区别在于学习速度。
Reinforcement learning and stochastic fictitious play are apparent rivals as models of human learning. They embody quite different assumptions about the processing of information and optimization. This paper compares their properties and finds that they are far more similar than were thought. In particular, the expected motion of stochastic fictitious play and reinforcement learning with experimentation can both be written as a perturbed form of the evolutionary replicator dynamics. Therefore they will in many cases have the same asymptotic behavior. In particular, local stability of mixed equilibria under stochastic fictitious play implies local stability under perturbed reinforcement learning. The main identifiable difference between the two models is speed: stochastic fictitious play gives rise to faster learning.