Online Reinforcement Learning Algorithm Design for Adaptive Optimal Consensus Control Under Interval Excitation

IEEE Transactions on Systems, Man, and Cybernetics: Systems · 2025
Cited by: 0
ABS 3

Summary

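The paper proposes an online data-driven reinforcement learning algorithm for adaptive output consensus control of heterogeneous multiagent systems with unknown dynamics. It replaces the traditional persistent excitation condition with an interval excitation condition, which relaxes the conditions required for learning.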

Abstract

This article proposes an online data-based reinforcement learning (RL) algorithm for adaptive output consensus control of heterogeneous multiagent systems (MASs) with unknown dynamics. First, we employ an adaptive control technique to design a distributed observer, which provides an estimate of the leader to some of the agents, thereby eliminating the need for global information. Then, we propose a novel data-based adaptive dynamic programming (ADP) approach, associated with a double-integrator operator, to develop an online data-driven algorithm for learning the optimal control policy. Existing algorithms for learning optimal control policies rely on persistent excitation conditions (PECs), a full-rank condition, and offline storage of historical data. To address these issues, the proposed method learns the optimal control policy online by solving data-driven linear regression equations (LREs) under an online-verifiable interval excitation (IE) condition, instead of relying on a PEC. In addition, the uniqueness of the LRE solution is established by verifying the invertibility of a matrix, rather than by satisfying the full-rank condition tied to the PEC and historical data storage that existing algorithms require. We show that the proposed learning algorithm not only guarantees optimal tracking with unknown dynamics but also relaxes some of the strict conditions of existing learning algorithms. Finally, a numerical example is provided to validate the effectiveness and performance of the proposed algorithm.
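The abstract does not give the regression equations themselves, so the following is only a minimal sketch of the general idea behind an interval excitation check, not the paper's algorithm. It assumes a generic two-parameter linear regression y = phi^T theta, accumulates M = sum(phi phi^T) and N = sum(phi y) online over a finite interval, and solves M theta = N only once the smallest eigenvalue of M exceeds a threshold. The function names, the threshold sigma, and the decaying regressor are all hypothetical choices for illustration. The distributed observer, the ADP formulation, and the double-integrator operator are not modeled.

```python
import math

def accumulate(samples):
    """Accumulate M = sum(phi phi^T) and N = sum(phi * y) for 2-D regressors.

    M is symmetric, so only (m11, m12, m22) is stored. No raw history is kept.
    """
    m11 = m12 = m22 = n1 = n2 = 0.0
    for (p1, p2), y in samples:
        m11 += p1 * p1
        m12 += p1 * p2
        m22 += p2 * p2
        n1 += p1 * y
        n2 += p2 * y
    return (m11, m12, m22), (n1, n2)

def min_eigenvalue(m):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[m11, m12], [m12, m22]]."""
    m11, m12, m22 = m
    mean = 0.5 * (m11 + m22)
    radius = math.hypot(0.5 * (m11 - m22), m12)
    return mean - radius

def solve_if_excited(samples, sigma=1e-6):
    """Return theta solving M theta = N, or None if excitation is insufficient.

    sigma is an assumed threshold: M counts as invertible once its smallest
    eigenvalue is at least sigma.
    """
    m, n = accumulate(samples)
    if min_eigenvalue(m) < sigma:
        return None  # interval excitation not (yet) verified
    m11, m12, m22 = m
    n1, n2 = n
    det = m11 * m22 - m12 * m12
    # Closed-form 2x2 solve (Cramer's rule).
    return ((m22 * n1 - m12 * n2) / det, (m11 * n2 - m12 * n1) / det)

# Hypothetical data: a decaying regressor that is exciting on a finite
# interval but would not be persistently exciting as t grows.
theta_true = (2.0, -1.0)
times = [0.1 * k for k in range(20)]
excited = [
    ((math.exp(-t), math.exp(-2 * t)),
     theta_true[0] * math.exp(-t) + theta_true[1] * math.exp(-2 * t))
    for t in times
]
theta_hat = solve_if_excited(excited)

# A regressor stuck in one direction never makes M invertible.
unexcited = [((1.0, 1.0), 1.0) for _ in range(5)]
theta_none = solve_if_excited(unexcited)
```

In this sketch the check runs on quantities computed from the data seen so far, which is the sense in which such a condition can be verified online, and the parameter estimate is recovered without storing the raw samples.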

Reinforcement learning, multiagent systems, adaptive control, optimal control, distributed control