A Simple Adaptive Procedure Leading to Correlated Equilibrium
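Proposes a simple adaptive procedure for playing a game, called "regret matching," in which players change their current strategy with probabilities proportional to their regret for not having used other strategies in the past, and proves that under this procedure the empirical distributions of play converge almost surely to the set of correlated equilibria.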
We propose a new and simple adaptive procedure for playing a game: "regret-matching." In this procedure, players may depart from their current play with probabilities that are proportional to measures of regret for not having used other strategies in the past. It is shown that our adaptive procedure guarantees that, with probability one, the empirical distributions of play converge to the set of correlated equilibria of the game.
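To make the switching rule concrete, here is a minimal Python sketch of one player's update as described above. It is an illustration under stated assumptions, not the paper's formal definition: the names `conditional_regrets`, `next_action_distribution`, `payoff`, and `mu` are hypothetical, the regret measure is taken to be the average gain the player would have obtained by replacing the current action with an alternative in every past period where the current action was played, and `mu` is assumed to be a constant large enough that the player always stays put with positive probability.

```python
def conditional_regrets(history, payoff, num_actions, current):
    """Average regret for having played `current` rather than each alternative.

    history: list of (own_action, opponent_action) pairs, one per past period.
    payoff:  function (own_action, opponent_action) -> float (assumed known).
    Only periods in which `current` was actually played contribute.
    """
    t = len(history)
    totals = [0.0] * num_actions
    for own, other in history:
        if own != current:
            continue
        for k in range(num_actions):
            totals[k] += payoff(k, other) - payoff(own, other)
    # Keep only the positive part: negative values mean no regret.
    return [max(total / t, 0.0) for total in totals]


def next_action_distribution(history, payoff, num_actions, mu):
    """Probabilities over next-period actions, given the play so far.

    Switching to k has probability regret(k) / mu; the remaining mass
    stays on the action played in the most recent period.
    `mu` is an assumed normalising constant, chosen large enough that
    the stay probability is positive.
    """
    current = history[-1][0]
    regrets = conditional_regrets(history, payoff, num_actions, current)
    probs = [r / mu for r in regrets]
    probs[current] = 0.0
    probs[current] = 1.0 - sum(probs)
    return probs


def matching_payoff(own, other):
    """Toy payoff (hypothetical): +1 if the actions match, -1 otherwise."""
    return 1.0 if own == other else -1.0


# The player chose action 0 twice while the opponent chose action 1,
# so the player regrets not having played action 1.
example_history = [(0, 1), (0, 1)]
example_probs = next_action_distribution(example_history, matching_payoff, 2, mu=8.0)
print(example_probs)
```

In this toy history the average regret for not having played action 1 is 2, so with `mu = 8` the sketch switches to action 1 with probability 0.25 and repeats action 0 with probability 0.75. Note that the rule conditions on the action just played, which gives the procedure inertia: with no regret, the player never moves.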