Model-Free Algorithms for Cooperative Output Regulation of Discrete-Time Multiagent Systems via Q-Learning Method

IEEE Transactions on Cybernetics · 2025

Cited by: 5

ABS 3

Huaguang Zhang
Tianbiao Wang
Dazhong Ma
Lulu Zhang

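Summary

For discrete-time multiagent systems with unknown system parameters, a model-free Q-learning algorithm is proposed that obtains the optimal policy directly, without requiring the system parameters or solving the regulator equations. It also addresses the computation of stabilizing gains when the initial policy is unstable.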

Abstract

This article addresses the cooperative output regulation problem for discrete-time multiagent systems with unknown parameters, a challenge that arises in many practical applications where system models are unavailable. Unlike existing techniques, a model-free Q-learning algorithm is devised to obtain the optimal policy iteratively. The algorithm does not depend on the system parameters, and its immediate cost formulation removes the need to solve the regulator equations. As a result, it has a streamlined structure and determines the optimal policy directly. The stability of each iteration of the algorithm is then formally established, and a unique condition for the Q-function matrix is derived. In addition, for the case where the initial policy is unstable, a data-driven algorithm is introduced that computes the initial stable gains, which ensures stability throughout the learning process. It is also shown that the distributed observer and the excitation noise do not introduce bias. Finally, the efficacy of the proposed algorithm is validated through two simulation examples.
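
To make the idea of model-free Q-learning with policy iteration concrete, the following is a minimal sketch for a single discrete-time linear system with a quadratic cost. It is not the article's algorithm: there is no multiagent network, distributed observer, or output regulation, and the initial gain is assumed to be stabilizing. The system matrices, weights, and function names (q_learning_lqr, riccati_gain) are hypothetical and chosen only for illustration. The learner sees only data triples (x_k, u_k, x_{k+1}); the matrices A and B are used solely to simulate the plant and to compute a model-based reference gain for comparison.

```python
import numpy as np

def q_learning_lqr(A, B, Qc, Rc, K0, n_iter=10, n_samples=200, seed=0):
    """Policy iteration on the Q-function Q(x, u) = z' H z with z = [x; u].

    A and B are used only to generate data (they play the role of the
    unknown plant); the learning step itself uses x, u, and x_next only.
    K0 is assumed to be stabilizing (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    for _ in range(n_iter):
        Phi, y = [], []
        x = rng.standard_normal(n)
        for _ in range(n_samples):
            # Behavior input: current policy plus excitation noise.
            u = -K @ x + rng.standard_normal(m)
            x_next = A @ x + B @ u
            z = np.concatenate([x, u])
            # The successor action follows the policy being evaluated,
            # so the Bellman equation below holds exactly for any noise.
            z_next = np.concatenate([x_next, -K @ x_next])
            Phi.append(np.kron(z, z) - np.kron(z_next, z_next))
            y.append(x @ Qc @ x + u @ Rc @ u)
            x = x_next
        # Solve z'Hz - z_next'H z_next = stage cost in the least-squares sense.
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)
        # Policy improvement: u = -K x minimizes z' H z over u.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

def riccati_gain(A, B, Qc, Rc, n_iter=1000):
    """Model-based reference gain via Riccati value iteration (check only)."""
    P = Qc.copy()
    for _ in range(n_iter):
        K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ A - A.T @ P @ B @ K
    return K

# Hypothetical open-loop stable plant, so the zero gain is stabilizing.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)
Rc = np.eye(1)

K_learned, H_learned = q_learning_lqr(A, B, Qc, Rc, K0=np.zeros((1, 2)))
K_reference = riccati_gain(A, B, Qc, Rc)
print("learned gain:  ", K_learned)
print("reference gain:", K_reference)
```

In this noise-free setting the least-squares step fits the Bellman equation exactly even though the input contains excitation noise, because the successor action is computed from the evaluated policy rather than taken from the noisy data. This illustrates, in the simplest single-system case only, why excitation noise need not bias a Q-function estimate.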

Multiagent systems · Cooperative control · Reinforcement learning · Model-free algorithms
