Reinforcement Learning Controller Design for Discrete-Time-Constrained Nonlinear Systems With Weight Initialization Method
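This article proposes a reinforcement learning controller design method that combines a control barrier function with a nonquadratic loss function and uses nonlinear model predictive control to initialize the network weights, so that the agent learns the optimal controller safely and efficiently while satisfying the system constraints.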
Reinforcement learning (RL) has been studied extensively as a way to acquire optimal controllers through interaction with the environment. However, real-world demands, including stricter safety requirements, pose considerable challenges to current RL-based optimal controller designs. This article introduces a novel approach for designing RL-based optimal controllers that employs a control barrier function (CBF) alongside a nonquadratic loss function on the control signal. The aim is to enable the agent to learn the optimal controller safely and efficiently. To address the neural network training instability inherent in traditional RL-based controller design, nonlinear model predictive control (NMPC) is used to initialize the controller network’s weights. A formal demonstration of the method’s optimality is presented. Numerical simulations validate the proposed approach, illustrating that it learns the optimal controller effectively while adhering to the system’s input and state constraints.
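The abstract names two ingredients, a nonquadratic loss on the control signal and a CBF, without stating their formulas. The sketch below illustrates forms that are common in the constrained-control literature and may differ from those used in the article: a scalar inverse-hyperbolic-tangent input penalty W(u) = 2r ∫_0^u u_max atanh(v/u_max) dv, and the discrete-time CBF inequality h(x_{k+1}) - h(x_k) >= -gamma h(x_k) with 0 < gamma <= 1. The function names, the scalar setting, and the parameters `r`, `u_max`, and `gamma` are all assumptions made for illustration.

```python
import math


def nonquadratic_input_cost(u, u_max, r=1.0):
    """Closed form of W(u) = 2 r * integral_0^u u_max * atanh(v / u_max) dv (scalar u).

    Assumed form, not taken from the article. Near u = 0 it behaves like r * u**2;
    it grows faster than a quadratic as |u| approaches u_max.
    """
    if abs(u) >= u_max:
        # This sketch treats the bound itself and anything beyond it as inadmissible.
        return math.inf
    s = u / u_max
    return 2.0 * r * u_max * u * math.atanh(s) + r * u_max**2 * math.log(1.0 - s * s)


def dcbf_satisfied(h, x, x_next, gamma=0.5):
    """Check the discrete-time CBF condition h(x_next) - h(x) >= -gamma * h(x).

    h is a user-supplied function with h(x) >= 0 on the safe set (hypothetical here).
    """
    return h(x_next) - h(x) >= -gamma * h(x)


if __name__ == "__main__":
    # Hypothetical safe set |x| <= 1, encoded as h(x) = 1 - x^2.
    h = lambda x: 1.0 - x * x
    print(nonquadratic_input_cost(0.5, u_max=1.0))
    print(dcbf_satisfied(h, 0.5, 0.6), dcbf_satisfied(h, 0.5, 0.9))
```

In a design of this kind, the penalty would typically enter the stage cost so that the minimizing control stays inside the input bound, while the CBF inequality limits how quickly the state may approach the boundary of the safe set from one step to the next.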