Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution
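Proposes the generalized policy iteration physics-informed neural network for solving high-dimensional continuous-time stochastic optimal control problems when the optimal control has no explicit solution. Separate networks approximate the value function and the multidimensional optimal control, and accuracy and convergence are validated on applications such as optimal execution.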
Multiasset Optimal Execution via Deep Learning for High-Dimensional Continuous-Time Stochastic Control
In “Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution,” Dupret and Hainaut introduce the generalized policy iteration physics-informed neural network, a novel deep learning algorithm for solving high-dimensional continuous-time stochastic optimal control problems even when the optimal control does not admit an explicit solution. The method combines physics-informed neural networks with an actor-critic structure based on generalized policy iteration and uses separate networks to approximate the value function and the multidimensional optimal control. This approach provides a global approximation of the solution across time and space, enabling fast online evaluation. Theoretical guarantees on convergence and optimality are provided, and the method's accuracy and efficacy are empirically validated through two important numerical examples from operations research. In doing so, the authors generalize the Almgren–Chriss framework arising from optimal execution in finance by allowing both temporary and permanent price impacts to be fully nonlinear and by considering a multidimensional setting with multiple cointegrated assets.
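For context on what is being generalized, the following is a minimal sketch of the classical single-asset Almgren–Chriss baseline in continuous time, assuming a linear temporary impact and a mean-variance objective. In that special case the optimal remaining inventory has the well-known closed form x(t) = x0 · sinh(κ(T − t)) / sinh(κT) with κ = sqrt(λσ²/η). This is not the authors' algorithm: it is the textbook case in which an explicit solution exists. The function name `almgren_chriss_holdings` and the parameter names are illustrative assumptions, not taken from the paper.

```python
import math


def almgren_chriss_holdings(x0, horizon, n_steps, sigma, eta, risk_aversion):
    """Remaining inventory on an even time grid for the classical
    continuous-time Almgren-Chriss liquidation problem.

    Assumptions (illustrative, not from the paper): one asset, linear
    temporary impact with coefficient eta, volatility sigma, and a
    mean-variance objective with risk-aversion weight risk_aversion.
    """
    if min(horizon, sigma, eta, risk_aversion) <= 0 or n_steps < 1:
        raise ValueError("horizon, sigma, eta, risk_aversion must be > 0 and n_steps >= 1")

    # Urgency parameter: larger values mean faster, more front-loaded selling.
    kappa = math.sqrt(risk_aversion * sigma ** 2 / eta)

    holdings = []
    for k in range(n_steps + 1):
        t = horizon * k / n_steps
        # Closed-form optimal inventory at time t.
        x_t = x0 * math.sinh(kappa * (horizon - t)) / math.sinh(kappa * horizon)
        holdings.append(x_t)
    return holdings


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to illustrate the call.
    path = almgren_chriss_holdings(
        x0=1000.0, horizon=1.0, n_steps=10, sigma=0.3, eta=0.1, risk_aversion=2.0
    )
    for k, x in enumerate(path):
        print(k, round(x, 2))
```

The trajectory starts at the full position, ends at zero, and is front-loaded relative to a straight-line schedule. Once the price impacts are nonlinear or several cointegrated assets are traded jointly, a closed form of this kind is generally unavailable, which is the setting the abstract targets with a learned value function and control.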