Cooperative Finitely Excited Learning for Dynamical Games
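For continuous-time zero-sum dynamic games, a cooperative finitely excited learning method is proposed that combines online data with recorded historical data to replace the conventional persistent excitation condition, while guaranteeing system stability and convergence to the Nash equilibrium.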
In this article, we propose an enhanced learning framework for zero-sum games whose dynamics evolve in continuous time. In contrast to conventional centralized actor-critic learning, we develop a cooperative finitely excited learning approach that combines data recorded online with instantaneous data to improve learning efficiency. By using an experience replay technique at each agent together with distributed interaction among agents, we replace the classical persistent excitation condition with an easy-to-check cooperative excitation condition. The approach also guarantees that the distributed actor-critic learners reach consensus on the solution to the Hamilton-Jacobi-Isaacs (HJI) equation. We show that both closed-loop stability of the equilibrium point and convergence to the Nash equilibrium are guaranteed. Simulation results demonstrate the efficacy of the approach compared with previous methods.
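To illustrate why a cooperative excitation condition can be easier to satisfy and to check than persistent excitation, the following minimal sketch tests a rank condition on recorded regressor vectors. It assumes a form of the condition that is common in the experience replay literature: the regressors stored by all agents, pooled together, must span the parameter space, even if no single agent's record does. The abstract does not state the article's exact condition, so the function names (`matrix_rank`, `is_cooperatively_excited`), the pooling rule, and the toy data are all hypothetical.

```python
from typing import List, Sequence

Matrix = List[Sequence[float]]


def matrix_rank(rows: Matrix, tol: float = 1e-9) -> int:
    """Rank of a small dense matrix via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]  # work on a copy
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def is_individually_excited(stack: Matrix, n_params: int) -> bool:
    """Assumed single-agent check: one agent's recorded regressors have full rank."""
    return matrix_rank(stack) == n_params


def is_cooperatively_excited(stacks: List[Matrix], n_params: int) -> bool:
    """Assumed cooperative check: regressors pooled over all agents have full rank."""
    pooled = [row for stack in stacks for row in stack]
    return matrix_rank(pooled) == n_params


# Toy data (hypothetical): three unknown weights, two agents.
agent_1 = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]  # excites only the first direction
agent_2 = [[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]  # excites the other two directions

print(is_individually_excited(agent_1, 3))              # False
print(is_individually_excited(agent_2, 3))              # False
print(is_cooperatively_excited([agent_1, agent_2], 3))  # True
```

In this toy case neither agent's recorded data is sufficient on its own, but the pooled record is. Because the check runs on a finite set of stored samples, it can be verified online, whereas persistent excitation is a condition on the entire future trajectory.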