Intelligent Critic Learning for Data-Driven Output Tracking Control With Lightweight Parallelization
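A data-driven parallel Q-learning algorithm is proposed. By designing a utility function directly linked to the system states, together with dual lightweight controllers, it addresses the instability, error amplification, and premature convergence of traditional discounted methods, eliminating tracking errors and accelerating convergence.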
This article investigates critical challenges in optimal output tracking control, including residual tracking errors, potential instability caused by discount factors, and premature convergence due to inefficient termination criteria. We tackle these issues by developing a data-driven parallel Q-learning algorithm. Specifically, a utility function directly linked to system states is proposed to avoid the instability and error amplification issues in traditional discounted approaches. In addition, the algorithm employs dual lightweight controllers that exploit convergence properties to enhance learning efficiency. Building on the dual controllers, a novel termination criterion is introduced to prevent premature convergence during the training process. Numerical simulations demonstrate that the proposed method eliminates tracking errors, accelerates convergence compared with traditional algorithms, and ensures stable convergence across diverse system dynamics.
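The instability that a discount factor can cause is easy to reproduce on a toy problem. The sketch below is not the article's algorithm: it is a textbook scalar linear-quadratic regulator, with the hypothetical helper `discounted_lqr_pole` and illustrative values for `a`, `b`, `q`, `r`, and `gamma` all chosen here as assumptions. It runs value iteration on the discounted Riccati recursion and reports the closed-loop pole of the original (undiscounted) plant under the resulting gain.

```python
def discounted_lqr_pole(a, b, q, r, gamma, iters=500):
    """Scalar plant x[k+1] = a*x[k] + b*u[k] with cost sum_k gamma**k * (q*x**2 + r*u**2).

    Returns (p, k_gain, pole), where pole = a - b*k_gain is the closed-loop
    pole of the ORIGINAL plant under the discounted-optimal feedback u = -k_gain*x.
    All names and values are illustrative assumptions, not taken from the article.
    """
    p = 0.0
    for _ in range(iters):
        # Value iteration on the discounted scalar Riccati recursion.
        p = q + gamma * a * a * p * r / (r + gamma * b * b * p)
    k_gain = gamma * a * b * p / (r + gamma * b * b * p)
    return p, k_gain, a - b * k_gain


if __name__ == "__main__":
    # Open-loop unstable plant (a = 2), unit weights.
    for gamma in (1.0, 0.2):
        p, k_gain, pole = discounted_lqr_pole(a=2.0, b=1.0, q=1.0, r=1.0, gamma=gamma)
        status = "stable" if abs(pole) < 1.0 else "unstable"
        print(f"gamma={gamma}: P={p:.3f}, K={k_gain:.3f}, pole={pole:.3f} ({status})")
```

With these assumed values, the undiscounted case (`gamma = 1.0`) places the closed-loop pole inside the unit circle, while the heavily discounted case (`gamma = 0.2`) yields a gain that is optimal for the discounted cost yet leaves the pole outside it. This is the kind of failure that a utility function without a discount factor is meant to sidestep.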