Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution

Operations Research · 2026

Cited by: 0

Journal rankings: Renmin University of China A · FT50 · UTD24 · ABS 4*

Jean-Loup Dupret · ETH Zurich
Donatien Hainaut · Université catholique de Louvain

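Summary

The paper proposes the generalized policy iteration physics-informed neural network for solving high-dimensional continuous-time stochastic optimal control problems without requiring an explicit solution for the optimal control. Separate networks approximate the value function and the multidimensional optimal control, and accuracy and convergence are validated in applications such as optimal execution.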

Abstract

Multiasset Optimal Execution via Deep Learning for High-Dimensional Continuous-Time Stochastic Control

In “Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution,” Dupret and Hainaut introduce the generalized policy iteration physics-informed neural network, a novel deep learning algorithm for solving high-dimensional continuous-time stochastic optimal control problems even when the optimal control does not admit an explicit solution. The method combines physics-informed neural networks with an actor-critic structure based on generalized policy iteration and uses separate networks to approximate the value function and the multidimensional optimal control. This approach provides a global approximation of the solution across time and space, enabling fast online evaluation. Theoretical guarantees on convergence and optimality are provided, and the method's accuracy and efficacy are empirically validated through two important numerical examples from operations research. In doing so, the authors generalize the Almgren–Chriss framework arising from optimal execution in finance by allowing both temporary and permanent price impacts to be fully nonlinear and by considering a multidimensional setting with multiple cointegrated assets.
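
To make the actor-critic structure described in the abstract more concrete, the block below sketches a generic textbook formulation of the problem class: a controlled diffusion, its Hamilton-Jacobi-Bellman (HJB) equation, and one plausible pair of critic and actor losses evaluated on sampled collocation points. All notation here is an assumption introduced for illustration (the symbols \mu, \sigma, f, g, the networks V_\theta and u_\phi, the sample sizes N and M, and the penalty weight \lambda); it is not taken from the paper and may differ from the authors' actual losses and sampling scheme.

```latex
% Controlled state dynamics and value function (illustrative notation)
dX_s = \mu(X_s, u_s)\,ds + \sigma(X_s, u_s)\,dW_s, \qquad
V(t,x) = \sup_{u}\ \mathbb{E}\!\left[\int_t^T f(X_s, u_s)\,ds + g(X_T)\ \middle|\ X_t = x\right]

% Hamiltonian and HJB equation with terminal condition
H(x,u;V) = \mu(x,u)^{\top}\nabla_x V
         + \tfrac{1}{2}\,\mathrm{Tr}\!\left(\sigma\sigma^{\top}(x,u)\,\nabla_x^2 V\right)
         + f(x,u),
\qquad
\partial_t V + \sup_{u} H(x,u;V) = 0, \quad V(T,x) = g(x)

% Critic step (policy evaluation): with the control network u_\phi held fixed,
% fit V_\theta to the PDE residual and the terminal condition
\mathcal{L}_{\mathrm{critic}}(\theta)
  = \frac{1}{N}\sum_{i=1}^{N}
    \Big(\partial_t V_\theta(t_i,x_i) + H\big(x_i, u_\phi(t_i,x_i); V_\theta\big)\Big)^{2}
  + \frac{\lambda}{M}\sum_{j=1}^{M}\big(V_\theta(T,x_j) - g(x_j)\big)^{2}

% Actor step (policy improvement): with V_\theta held fixed,
% push u_\phi toward the maximizer of the Hamiltonian
\mathcal{L}_{\mathrm{actor}}(\phi)
  = -\frac{1}{N}\sum_{i=1}^{N} H\big(x_i, u_\phi(t_i,x_i); V_\theta\big)
```

Alternating the two steps avoids solving the inner supremum in closed form, which is the situation the abstract refers to when the optimal control has no explicit solution.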

Stochastic control · Deep learning · Optimal execution · Operations research · Financial engineering
