Continuous-Time Stochastic Policy Iteration of Adaptive Dynamic Programming

IEEE Transactions on Systems, Man, and Cybernetics: Systems · 2023
Cited by 41
ABS 3

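Summary

For continuous-time nonlinear systems with stochastic disturbances, the paper proposes a new stochastic adaptive dynamic programming method that approximates the value function and the control law simultaneously under the conditional expectation, and proves both the asymptotic stability of the closed-loop system and the convergence of the algorithm.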

Abstract

In this article, we study the optimal control problem of continuous-time (CT) time-invariant nonlinear systems with stochastic nonlinear disturbances. A new stochastic adaptive dynamic programming (ADP) method is developed to solve the Hamilton–Jacobi–Bellman equation (HJBE). Under the conditional expectation, the value function and the control law are approximated simultaneously through successive iterations. The asymptotic stability in probability of the closed-loop stochastic system is analyzed by the stochastic Lyapunov direct method, and the convergence of the developed ADP method is established. Finally, four simulations illustrate the effectiveness of the developed method.
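
For intuition about how a policy iteration alternates between evaluating a value function and improving a control law, the following is a minimal sketch on a toy problem. It is not the method developed in the article. It assumes a scalar linear system dx = (a x + b u) dt + sigma x dw with cost E∫(q x² + r u²) dt, a quadratic value function V(x) = p x², and a known model. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def policy_iteration(a, b, sigma, q, r, k0, iters=50, tol=1e-12):
    """Toy policy iteration for dx = (a x + b u) dt + sigma x dw.

    Assumed setup (not taken from the article): linear feedback u = -k x,
    quadratic value function V(x) = p x^2, and a known model (a, b, sigma).
    Returns the converged value coefficient p and feedback gain k.
    """
    k = k0
    p = None
    for _ in range(iters):
        # With V = p x^2, the generator under u = -k x gives
        # (2 (a - b k) + sigma^2) p x^2, so this sum must be negative
        # for the current gain to be mean-square stabilizing.
        drift = 2.0 * (a - b * k) + sigma ** 2
        if drift >= 0.0:
            raise ValueError("gain is not mean-square stabilizing")
        # Policy evaluation: solve q + r k^2 + drift * p = 0 for p.
        p_new = (q + r * k * k) / (-drift)
        # Policy improvement: minimizing r u^2 + V_x b u gives u = -(b p / r) x.
        k = b * p_new / r
        converged = p is not None and abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    return p, k


def riccati_solution(a, b, sigma, q, r):
    """Positive root of (2 a + sigma^2) p - b^2 p^2 / r + q = 0."""
    c = 2.0 * a + sigma ** 2
    return r * (c + math.sqrt(c * c + 4.0 * b * b * q / r)) / (2.0 * b * b)


if __name__ == "__main__":
    p, k = policy_iteration(a=1.0, b=1.0, sigma=0.5, q=1.0, r=1.0, k0=3.0)
    print("p =", p, "k =", k)
    print("closed-form p =", riccati_solution(1.0, 1.0, 0.5, 1.0, 1.0))
```

In this toy case the fixed point of the iteration can be checked against the closed-form root of the scalar Riccati-type equation, which is what `riccati_solution` computes. The initial gain `k0` must already be stabilizing, otherwise the evaluation step has no valid solution.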

Optimal control · Stochastic systems · Adaptive dynamic programming · Nonlinear systems