Nash–Minmax Strategies for Multiagent Pursuit–Evasion Games With Reinforcement Learning

IEEE Transactions on Cybernetics · 2025
Cited by 11 · Top 10% for the same journal and year
ABS 3

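Summary

This work studies the target capture problem in multiagent pursuit-evasion games and proposes a data-driven optimal control policy based on off-policy reinforcement learning and Nash-minmax strategies, which achieves target capture without prior knowledge of the agent dynamics.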

Abstract

This article investigates the pursuit-evasion games for target capture in multiagent systems. To address this challenge, a novel data-driven optimal control policy is proposed, leveraging off-policy reinforcement learning and Nash-minmax strategies. First, a comprehensive framework for multiagent pursuit-evasion games is developed, modeled as a two-layer game structure. In this framework, interactions among agents within the same team are characterized as nonzero-sum games, while interactions between opposing teams are adversarial and treated as zero-sum games. Second, Nash-minmax strategies are introduced to solve the formulated multiagent pursuit-evasion games. These strategies effectively derive distributed Nash solutions for agents within the same team and adversarial worst-case policies for agents in opposing teams. Furthermore, to eliminate the reliance on prior knowledge of agent dynamics and initial stabilizing control gains, a data-driven optimal control policy is designed, ensuring the achievement of target capture. Finally, a numerical example is provided to demonstrate the effectiveness and practical applicability of the proposed approach.

Keywords: multiagent systems, pursuit-evasion games, reinforcement learning, game theory, optimal control