Distributed Formation Control for Underactuated Multi-ASVs Under DoS Attacks Using Value Decomposition Reinforcement Learning
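This work studies formation control of underactuated multiple autonomous surface vehicles whose communication network is subject to denial-of-service attacks and complex uncertainties. It proposes a distributed adaptive controller based on value decomposition reinforcement learning that achieves target state estimation and convergence of the formation errors.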
This article addresses the formation control problem for underactuated multiple autonomous surface vehicles (multi-ASVs) subject to denial-of-service (DoS) attacks on the communication network and to complex uncertainties, including unknown ASV dynamics, external disturbances, and obstacles. First, a novel distributed target estimator (DTE) using a first-order low-pass filter is designed to estimate the target state from partial observations under target information constraints and DoS attacks. Second, a safe guidance law for the ASVs is developed using the estimated target state and a control barrier function. Third, a “dual-adaptation control and learning” bidirectional fusion model is constructed. On one hand, a distributed adaptive formation controller based on value decomposition reinforcement learning (VDRL) is designed. By decomposing the global value function, this controller addresses the credit assignment problem in cooperative formation and allows the ASVs to adjust their strategies autonomously in response to the environment. On the other hand, an adaptive mechanism is used to improve how the global value function is computed in VDRL, enhancing the learning efficiency and stability of the reinforcement learning (RL) algorithm in complex environments. The proposed VDRL-based formation control algorithm for underactuated multi-ASVs ensures accurate target state estimation and convergence of the formation errors under DoS attacks. A rigorous theoretical analysis establishes the closed-loop stability of the multi-ASV system. Finally, simulation results validate the effectiveness of the proposed distributed formation control algorithm.
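To illustrate how decomposing a global value function can help with credit assignment, the following is a minimal sketch that assumes the simplest additive (VDN-style) decomposition, in which the global value is the sum of per-agent local values. The abstract does not specify the decomposition used in the article, so the additive form, the function names, and the numerical Q-values here are all hypothetical. Under this assumption, each ASV can select its action greedily from its own local value and the resulting joint action still maximizes the global value.

```python
import itertools

def global_q(local_qs, joint_action):
    """Additive global value (an assumption): sum of each agent's local Q for its own action."""
    return sum(q[a] for q, a in zip(local_qs, joint_action))

def decentralized_greedy(local_qs):
    """Each agent picks the action that maximizes its own local Q, without communication."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in local_qs)

def centralized_greedy(local_qs):
    """Brute-force search over all joint actions for the maximizer of the global Q."""
    action_sets = [range(len(q)) for q in local_qs]
    return max(itertools.product(*action_sets),
               key=lambda joint: global_q(local_qs, joint))

# Hypothetical local Q-values for three ASVs, each with three discrete actions.
local_qs = [
    [0.1, 0.7, 0.3],
    [0.5, 0.2, 0.9],
    [0.4, 0.6, 0.0],
]

joint = decentralized_greedy(local_qs)
print("decentralized joint action:", joint)
print("matches centralized search:", joint == centralized_greedy(local_qs))
```

The additive form is chosen only because it makes the equivalence between decentralized and centralized action selection easy to verify. Other decompositions, such as monotonic mixing networks, preserve the same property with more expressive global value functions.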