Algorithms for Stochastic Games With Perfect Monitoring
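This paper studies pure-strategy subgame-perfect Nash equilibria in stochastic games with perfect monitoring, develops efficient algorithms for computing equilibrium payoffs by combining policy iteration with value iteration, and provides software implementations.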
We study the pure‐strategy subgame‐perfect Nash equilibria of stochastic games with perfect monitoring, geometric discounting, and public randomization. We develop novel algorithms for computing equilibrium payoffs, in which we combine policy iteration when incentive constraints are slack with value iteration when incentive constraints bind. We also provide software implementations of our algorithms. Preliminary simulations indicate that they are significantly more efficient than existing methods. The theoretical results that underlie the algorithms also imply bounds on the computational complexity of equilibrium payoffs when there are two players. When there are more than two players, we show by example that the number of extreme equilibrium payoffs may be countably infinite.
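As a rough illustration of the two building blocks named above, the following sketch runs value iteration and policy iteration on a toy single-agent discounted decision problem. It is not the algorithm of this paper: there are no incentive constraints, no second player, and no public randomization. The two-state problem, the discount factor `DELTA`, the `(1 - DELTA)` payoff normalization, and all function names are assumptions chosen for the example.

```python
# Toy single-agent problem (hypothetical, not taken from the paper).
DELTA = 0.9                          # assumed geometric discount factor
REWARD = [[1.0, 0.0], [0.0, 2.0]]    # REWARD[state][action]
NEXT = [[0, 1], [0, 1]]              # deterministic next state NEXT[state][action]
STATES = (0, 1)
ACTIONS = (0, 1)


def q_value(v, s, a):
    """Normalized discounted payoff of playing a in s, then continuing with v."""
    return (1 - DELTA) * REWARD[s][a] + DELTA * v[NEXT[s][a]]


def max_gap(v, w):
    """Largest coordinate-wise difference between two value vectors."""
    return max(abs(x - y) for x, y in zip(v, w))


def value_iteration(tol=1e-12):
    """Apply the Bellman operator repeatedly until the values stop moving."""
    v = [0.0, 0.0]
    while True:
        new_v = [max(q_value(v, s, a) for a in ACTIONS) for s in STATES]
        if max_gap(new_v, v) < tol:
            return new_v
        v = new_v


def evaluate_policy(policy, tol=1e-12):
    """Compute the values of a fixed policy by successive approximation."""
    v = [0.0, 0.0]
    while True:
        new_v = [q_value(v, s, policy[s]) for s in STATES]
        if max_gap(new_v, v) < tol:
            return new_v
        v = new_v


def policy_iteration():
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    policy = [0, 0]
    while True:
        v = evaluate_policy(policy)
        improved = [max(ACTIONS, key=lambda a: q_value(v, s, a)) for s in STATES]
        if improved == policy:
            return policy, v
        policy = improved


if __name__ == "__main__":
    print(value_iteration())
    print(policy_iteration())
```

Both routines reach the same fixed point here. Value iteration gets there through many small contraction steps, while policy iteration changes the policy only a few times and evaluates each candidate in full, which is the kind of trade-off the combination described above is meant to exploit.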