Two Competing Models of How People Learn in Games

Econometrica · 2002

Cited by 179

Journal rankings: Renmin University A+ · FT50 · ABS 4*

Ed Hopkins · University of Edinburgh (corresponding author)

Summary

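The paper compares two models of human learning, reinforcement learning and stochastic fictitious play, and finds them more similar than expected: the expected motion of each can be written as a perturbed form of the evolutionary replicator dynamics, and the main difference between them is the speed of learning.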

Abstract

Reinforcement learning and stochastic fictitious play are apparent rivals as models of human learning. They embody quite different assumptions about the processing of information and optimization. This paper compares their properties and finds that they are far more similar than was thought. In particular, the expected motion of stochastic fictitious play and reinforcement learning with experimentation can both be written as a perturbed form of the evolutionary replicator dynamics. Therefore they will in many cases have the same asymptotic behavior. Specifically, local stability of mixed equilibria under stochastic fictitious play implies local stability under perturbed reinforcement learning. The main identifiable difference between the two models is speed: stochastic fictitious play gives rise to faster learning.
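
To make "a perturbed form of the evolutionary replicator dynamics" concrete, here is a minimal Python sketch. It is not taken from the paper: the game (matching pennies), the perturbation (a uniform pull of strength `delta` toward equal mixing, standing in for experimentation), the Euler step size, and all function names are assumptions chosen for illustration. The paper's own perturbation terms differ in form.

```python
import math


def step(x, y, delta, h):
    """One Euler step of a two-population replicator dynamic for matching pennies.

    x, y  : probability that player 1 / player 2 plays their first strategy.
    delta : strength of an assumed perturbation pulling each player toward 50/50.
    h     : Euler step size.
    """
    # Replicator terms: share * (1 - share) * (payoff advantage of strategy 1).
    # Player 1 wants to match (advantage 4y - 2); player 2 wants to mismatch (2 - 4x).
    dx = x * (1 - x) * (4 * y - 2) + delta * (0.5 - x)
    dy = y * (1 - y) * (2 - 4 * x) + delta * (0.5 - y)
    return x + h * dx, y + h * dy


def simulate(delta, steps=20000, h=0.01, x=0.9, y=0.2):
    """Iterate the dynamic from (x, y) and return the final state."""
    for _ in range(steps):
        x, y = step(x, y, delta, h)
    return x, y


def distance_from_mixed(point):
    """Euclidean distance from the mixed equilibrium (0.5, 0.5)."""
    x, y = point
    return math.hypot(x - 0.5, y - 0.5)


if __name__ == "__main__":
    for delta in (0.0, 0.1):
        end = simulate(delta)
        print(f"delta={delta}: final state {end}, "
              f"distance from mixed equilibrium {distance_from_mixed(end):.4f}")
```

In this toy setting the unperturbed dynamic (`delta=0.0`) keeps orbiting the mixed equilibrium without approaching it, while a positive `delta` makes the same equilibrium locally attracting. That is the kind of stability question the abstract refers to, though the sketch says nothing about the relative speed of the two learning models.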

Keywords: reinforcement learning; stochastic fictitious play; replicator dynamics; mixed equilibrium; stability