Memory-Efficient Inverse Reinforcement Learning for Multiplayer Differential Games

IEEE Transactions on Cybernetics · 2025

Cited by 1

ABS 3

Jiacheng Wu
Yang Zhu
Hongye Su

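Summary

Proposes a memory-efficient inverse reinforcement learning algorithm that requires neither data storage nor a persistent excitation condition, for inferring unknown cost functions from expert demonstrations in multiplayer differential games, and designs a filter-based homotopic algorithm to address the difficulty of obtaining an initial admissible control policy.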

Abstract

Data-driven inverse reinforcement learning (RL) control aims to infer the unknown cost function of a learner system from expert demonstrations. The convergence of existing methods necessitates a data storage mechanism to maintain persistent excitation (PE), which consumes memory and induces delays in satisfying full-rank conditions. To address these problems, in this article, we propose a novel memory-efficient inverse RL algorithm for multiplayer differential games that eliminates the need for strict PE and data storage. We prove that Nash equilibrium solutions for the learner system can be guaranteed under a mild initial excitation condition. In addition, existing inverse RL control algorithms often rely on an initial admissible control policy (IACP), which is difficult to obtain in data-driven scenarios. We address this problem by designing a novel filter-based homotopic RL algorithm, which derives an IACP for learner systems by shifting unstable poles into a stable region. Moreover, we establish several properties of the designed algorithms, including convergence, nonuniqueness, and stability. Finally, the effectiveness of the proposed algorithms is verified by comparative studies and simulation results.

Reinforcement learning · Inverse reinforcement learning · Differential games · Control theory · Artificial intelligence
