Offline and Online Adaptive Critic Control Designs With Stability Guarantee Through Value Iteration

IEEE Transactions on Cybernetics · 2021
Cited by 62
ABS 3

Summary

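This work studies how control policies generated by value iteration affect closed-loop stability. It proposes an offline integrated value iteration scheme and an online adaptive dynamic programming algorithm that ensure the state trajectory converges to the origin and, for linear systems, tolerate a finite number of unstable policies.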

Abstract

This article is concerned with the stability of the closed-loop system under various control policies generated by value iteration (VI). Stability properties, including admissibility criteria and the attraction domain, are investigated. An offline integrated VI scheme with a stability guarantee is developed by combining the advantages of VI and policy iteration, which makes it convenient to obtain admissible control policies. Also, based on the concept of the attraction domain, an online adaptive dynamic programming (ADP) algorithm using immature control policies is developed. Remarkably, it is ensured that the state trajectory under the online algorithm converges to the origin. In particular, for linear systems, the online ADP algorithm with a general scheme possesses a stronger stability property. The theoretical results reveal that the stability of the linear system can be guaranteed even if the control policy sequence includes finitely many unstable elements. The numerical results verify the effectiveness of the proposed algorithms.
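
The linear-system claim can be made concrete with a toy example. The sketch below is not the algorithm from the article. It is a minimal illustration, assuming a scalar discrete-time system x_{k+1} = a x_k + b u_k with quadratic cost q x^2 + r u^2, VI started from V_0 = 0, and hypothetical values a = 2, b = q = r = 1. The function names are illustrative only. It checks which VI-generated gains are stabilizing and then applies the evolving gain sequence one step at a time.

```python
def value_iteration_gains(a, b, q, r, iterations):
    """Scalar LQR value iteration from V_0(x) = 0.

    Each iterate has the form V_i(x) = p_i * x**2. Returns the feedback
    gains K_i (control law u = -K_i * x) that are greedy with respect to V_i.
    """
    p = 0.0
    gains = []
    for _ in range(iterations):
        denom = r + b * b * p
        gains.append(a * b * p / denom)             # greedy gain for V_i
        p = q + a * a * p - (a * b * p) ** 2 / denom  # scalar Riccati update
    return gains


def is_stabilizing(a, b, k):
    """A scalar gain is stabilizing when the closed-loop pole lies inside the unit circle."""
    return abs(a - b * k) < 1.0


def simulate_evolving_policy(a, b, gains, x0):
    """Apply gain K_k at time step k, so the policy changes along the trajectory."""
    x = x0
    trajectory = [x]
    for k in gains:
        x = (a - b * k) * x
        trajectory.append(x)
    return trajectory


# Hypothetical open-loop unstable system (a = 2), chosen only for illustration.
A, B, Q, R = 2.0, 1.0, 1.0, 1.0
gains = value_iteration_gains(A, B, Q, R, 30)
flags = [is_stabilizing(A, B, k) for k in gains]
trajectory = simulate_evolving_policy(A, B, gains, 1.0)

print("stabilizing flags for the first five policies:", flags[:5])
print("final state magnitude:", abs(trajectory[-1]))
```

In this toy setting the first two gains fail the stability test while every later gain passes it, and the state still decays toward the origin when the gains are applied in sequence. This matches the flavor of the statement about finitely many unstable policies, under the assumptions stated above.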

Adaptive control · Optimal control · Value iteration · Stability analysis · Adaptive dynamic programming