Learning Constrained Parametric Differentiable Predictive Control Policies With Guarantees

IEEE Transactions on Systems, Man, and Cybernetics: Systems · 2024

Cited by 28 · Top 9% of papers in the same journal and year

ABS 3

Ján Drgoňa
Aaron Tuor
Draguna Vrabie

Summary

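The paper proposes differentiable predictive control (DPC), a method that learns constrained neural control policies for nonlinear systems offline. It uses automatic differentiation to compute sensitivities and derives probabilistic guarantees on closed-loop stability and constraint satisfaction from Hoeffding's inequality. The authors report that it is more efficient than implicit and explicit MPC and than model-free reinforcement learning.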

Abstract

We present differentiable predictive control (DPC), a method for offline learning of constrained neural control policies for nonlinear dynamical systems with performance guarantees. We show that the sensitivities of the parametric optimal control problem can be used to obtain direct policy gradients. Specifically, we employ automatic differentiation (AD) to efficiently compute the sensitivities of the model predictive control (MPC) objective function and constraint penalties. To guarantee safety upon deployment, we derive probabilistic guarantees on closed-loop stability and constraint satisfaction based on indicator functions and Hoeffding’s inequality. We empirically demonstrate that the proposed method can learn neural control policies for various parametric optimal control tasks. In particular, we show that the proposed DPC method can stabilize systems with unstable dynamics, track time-varying references, and satisfy nonlinear state and input constraints. Our DPC method has practical time savings compared to alternative approaches for fast and memory-efficient controller design. Specifically, DPC does not depend on a supervisory controller as opposed to approximate MPC based on imitation learning. We demonstrate that, without losing performance, DPC is scalable with greatly reduced demands on memory and computation compared to implicit and explicit MPC while being more sample efficient than model-free reinforcement learning (RL) algorithms.
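
The probabilistic guarantee mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard one-sided Hoeffding bound applied to i.i.d. 0/1 indicators, where each indicator records whether one sampled closed-loop simulation remained stable and satisfied all constraints. The function names, the simulated outcomes, and the chosen confidence level are hypothetical, and the exact form of the bound used in the paper may differ.

```python
import math


def hoeffding_lower_bound(indicators, delta):
    """Lower bound on the true success probability, valid with confidence 1 - delta.

    Assumes `indicators` are i.i.d. samples in {0, 1} (an assumption of this sketch).
    Uses the one-sided Hoeffding inequality: P(mu >= mean - eps) >= 1 - exp(-2 m eps^2).
    """
    m = len(indicators)
    if m == 0:
        raise ValueError("at least one sample is required")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    empirical_mean = sum(indicators) / m
    # Solve exp(-2 m eps^2) = delta for eps.
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * m))
    return max(0.0, empirical_mean - eps)


def samples_needed(eps, delta):
    """Smallest m for which the Hoeffding margin is at most eps at confidence 1 - delta."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))


# Hypothetical outcomes: 990 of 1000 simulated trajectories passed both checks.
outcomes = [1] * 990 + [0] * 10
bound = hoeffding_lower_bound(outcomes, delta=0.01)
required = samples_needed(eps=0.05, delta=0.01)
```

In this sketch the margin shrinks as the square root of the number of sampled trajectories, so tightening the guarantee by a factor of two needs roughly four times as many simulations.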

Model predictive control · Reinforcement learning · Nonlinear systems · Optimal control · Artificial intelligence
