Reducible Markov Decision Processes and Stochastic Games
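This paper proposes reducible Markov decision processes, whose exact solutions are obtained by solving simpler coordinate MDPs, thereby achieving dimension reduction. It extends the framework to stochastic games, gives existence conditions and closed-form solutions for pure-strategy Markov perfect equilibria, and applies the models to capacity management, inventory management, and duopoly competition.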
Markov decision processes (MDPs) provide a powerful framework for analyzing dynamic decision making. However, their application is significantly hindered by the difficulty of obtaining solutions. In this study, we introduce reducible MDPs, whose exact solutions can be obtained by solving simpler MDPs termed coordinate MDPs. The value function and an optimal policy of a reducible MDP are linear functions of those of the associated coordinate MDP. Because the coordinate MDP does not involve the multi-dimensional endogenous state, solving it in place of the reducible MDP achieves dimension reduction. Extending the MDP framework to multiple players, we introduce reducible stochastic games. We show that these games reduce to simpler coordinate games that do not involve the multi-dimensional endogenous state. We specify sufficient conditions for the existence of a pure-strategy Markov perfect equilibrium in reducible stochastic games and derive closed-form expressions for the players’ equilibrium values. The reducible framework encompasses a variety of linear and nonlinear models and offers substantial simplification in analysis and computation. We provide guidelines and illustrative examples for formulating problems as reducible models. We demonstrate the applicability and modeling flexibility of reducible models in a wide range of contexts, including capacity and inventory management and duopoly competition.
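To make the dimension-reduction idea concrete, the following is a minimal sketch on a deliberately simple toy model; it is not the paper's definition of reducibility. It assumes a scalar, nonnegative endogenous state x, a finite exogenous Markov state s, a reward r[s, a] * x, and a transition x' = g[s, a] * x with g >= 0. Under these assumptions the finite-horizon value satisfies V(s, x) = v(s) * x, where v solves a recursion over s alone. All arrays, names, and numbers below are hypothetical.

```python
import numpy as np

# Hypothetical toy data (not from the paper): 2 exogenous states, 2 actions.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # P[s, s2]: exogenous transition probabilities
r = np.array([[1.0, 0.5],
              [0.2, 0.8]])   # reward per unit of x: reward = r[s, a] * x
g = np.array([[0.9, 1.2],
              [1.1, 0.6]])   # growth factor: x' = g[s, a] * x, assumed >= 0
beta = 0.9                   # discount factor
T = 5                        # number of decision periods


def coordinate_values(periods):
    """Backward induction over s only (the endogenous state x never appears)."""
    v = np.zeros(P.shape[0])
    for _ in range(periods):
        cont = P @ v                          # E[v(s2) | s], shape (S,)
        q = r + beta * g * cont[:, None]      # q[s, a]
        v = q.max(axis=1)
    return v


def full_value(s, x, periods):
    """Brute-force recursion on the full state (s, x), for comparison."""
    if periods == 0:
        return 0.0
    best = -np.inf
    for a in range(r.shape[1]):
        x_next = g[s, a] * x
        cont = sum(P[s, s2] * full_value(s2, x_next, periods - 1)
                   for s2 in range(P.shape[0]))
        best = max(best, r[s, a] * x + beta * cont)
    return best


v0 = coordinate_values(T)
x0 = 3.0
# Largest discrepancy between the full solution and the linear form v(s) * x.
gap = max(abs(full_value(s, x0, T) - v0[s] * x0) for s in range(P.shape[0]))
print("max gap between V(s, x) and v(s) * x:", gap)
```

The brute-force recursion has to carry x through every branch, while the reduced recursion works on a vector with one entry per exogenous state. In this toy the two agree up to floating-point error because x >= 0 lets the factor x be pulled out of the maximization at every stage.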