Robot Policy Improvement With Natural Evolution Strategies for Stable Nonlinear Dynamical System

IEEE Transactions on Cybernetics · 2022

Cited by 56

ABS 3

Guang Chen
Zhijun Li
Yingbai Hu
Alois Knoll

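Summary

The paper proposes a hierarchical learning strategy that optimizes the parameters of a dynamical system with natural evolution strategies. It improves the robustness and adaptability of robot imitation learning while guaranteeing global stability, and it applies to tasks such as obstacle avoidance and stiffness learning.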

Abstract

Robot learning through kinesthetic teaching is a promising way of cloning human behaviors, but compounding errors limit its performance on complex tasks when only small amounts of data are available. To improve the robustness and adaptability of imitation learning, a hierarchical learning strategy is proposed: low-level learning comprises only behavioral cloning with supervised learning, and high-level learning constitutes policy improvement. First, a Gaussian mixture model (GMM)-based dynamical system is formulated to encode a motion from the demonstration. We then use the Lyapunov stability theorem to derive sufficient conditions on the GMM parameters that guarantee the global stability of the dynamical system from any initial state. Because imitation learning must reason about the motion well into the future across a wide range of tasks, policy improvement is important for making the learning method more adaptable. Finally, a method based on exponential natural evolution strategies is proposed to optimize the parameters of the dynamical system associated with the stiffness of variable impedance control. The exploration noise is subject to the stability conditions of the dynamical system in the exploration space, which guarantees global stability. Empirical evaluations are conducted on manipulators in different scenarios, including motion planning with obstacle avoidance and stiffness learning.
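
The policy-improvement step described in the abstract can be illustrated with a minimal sketch of exponential natural evolution strategies (xNES) in which candidate parameters that violate a stability condition are discarded. This is not the authors' implementation. It assumes a toy linear system x_dot = W x in place of the GMM-based dynamical system, uses the standard sufficient condition that W + W^T be negative definite as a stand-in for the paper's conditions on the GMM parameters, and handles infeasible samples by giving them infinite cost. The names `xnes`, `is_stable`, `sym_expm`, the target matrix, and all step sizes are hypothetical.

```python
import numpy as np


def sym_expm(m):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.exp(w)) @ v.T


def is_stable(theta):
    """Assumed stand-in condition: W + W^T negative definite for x_dot = W x."""
    w = theta.reshape(2, 2)
    return np.all(np.linalg.eigvalsh(w + w.T) < 0.0)


def xnes(cost, feasible, mu, scale=0.3, pop=20, iters=300,
         eta_mu=1.0, eta_a=0.2, seed=0):
    """Minimize `cost` with xNES; infeasible samples are ranked last."""
    rng = np.random.default_rng(seed)
    d = mu.size
    a = scale * np.eye(d)  # search covariance is a @ a.T

    # Rank-based utilities: best sample first, utilities sum to zero.
    ranks = np.arange(1, pop + 1)
    util = np.maximum(0.0, np.log(pop / 2 + 1) - np.log(ranks))
    util = util / util.sum() - 1.0 / pop

    for _ in range(iters):
        s = rng.standard_normal((pop, d))   # samples in local coordinates
        z = mu + s @ a.T                    # candidate parameter vectors
        costs = np.array([cost(zi) if feasible(zi) else np.inf for zi in z])
        s_sorted = s[np.argsort(costs)]     # ascending cost

        # Natural-gradient estimates in local coordinates.
        g_delta = util @ s_sorted
        g_m = (s_sorted.T * util) @ s_sorted - util.sum() * np.eye(d)

        mu = mu + eta_mu * a @ g_delta
        a = a @ sym_expm(0.5 * eta_a * g_m)  # multiplicative (exponential) update
    return mu


# Toy usage: recover a stable target matrix from a stable starting point.
target = np.array([-1.0, 2.0, -2.0, -1.0])
theta = xnes(lambda t: np.sum((t - target) ** 2), is_stable,
             mu=np.array([-0.5, 0.0, 0.0, -0.5]))
print(theta.reshape(2, 2))
```

Ranking infeasible samples last is only one way to keep exploration inside the stable region. The abstract states that the exploration noise itself is constrained by the stability conditions, which may be realized differently in the paper.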

Robot learning · Imitation learning · Dynamical systems · Reinforcement learning · Control theory