Stochastic Dynamic Linear Programming: A Sequential Sampling Algorithm for Multistage Stochastic Linear Programming

SIAM Journal on Optimization · 2021

Cited by 9

ABS 3

Harsha Gangammanavar
Suvrajeet Sen

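Summary

Proposes a sequential sampling algorithm (SDLP) for multistage stochastic linear programming problems. It requires no scenario tree or sample average approximation to be built in advance, the decision process can recursively assimilate new data, and its asymptotic convergence is proved.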

Abstract

Multistage stochastic programming deals with operational and planning problems that involve a sequence of decisions over time while responding to an uncertain future. Algorithms designed to address multistage stochastic linear programming (MSLP) problems often rely upon scenario trees to represent the underlying stochastic process. When this process exhibits stagewise independence, sampling-based techniques, particularly the stochastic dual dynamic programming algorithm, have received wide acceptance. However, these sampling-based methods still operate with a deterministic representation of the problem which uses the so-called sample average approximation. In this work, we present a sequential sampling approach for MSLP problems that allows the decision process to assimilate newly sampled data recursively. We refer to this method as the stochastic dynamic linear programming (SDLP) algorithm. Since we use sequential sampling, the algorithm does not necessitate a priori representation of uncertainty, through either a scenario tree or sample average approximation, both of which require a knowledge/estimation of the underlying distribution. This method constitutes a generalization of the stochastic decomposition algorithm for two-stage stochastic linear programming models. The approximations used within SDLP may be viewed either through the lens of proximal methods or via regularization. Furthermore, we introduce the notion of basic feasible policies which provide a piecewise affine solution discovery scheme, which is embedded within the optimization algorithm to identify incumbent solutions used in the context of proximal iterations. Finally, we show that the SDLP algorithm provides a sequence of decisions and corresponding value function estimates along a sequence of state trajectories that asymptotically converge to their optimal counterparts, with probability one.
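The contrast the abstract draws between a fixed sample average approximation and recursive assimilation of new samples can be illustrated with a toy sketch. This is not the SDLP algorithm: the recourse cost function, the uniform demand distribution, and every name below (`recourse_cost`, `RunningMean`, `saa_estimate`) are assumptions made purely for illustration. It shows only that an expected-cost estimate can be updated one observation at a time, without committing to a sample set up front.

```python
import random


def recourse_cost(x, demand, shortage_penalty=3.0):
    """Hypothetical second-stage cost: penalty on demand not covered by x."""
    return shortage_penalty * max(demand - x, 0.0)


def saa_estimate(costs):
    """Fixed-sample estimate: average over a batch that was drawn in advance."""
    return sum(costs) / len(costs)


class RunningMean:
    """Sequential estimate: each new observation is absorbed recursively."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Incremental mean update: new_mean = old_mean + (value - old_mean) / n
        self.mean += (value - self.mean) / self.count
        return self.mean


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x = 4.0                 # a candidate first-stage decision (assumed)
demands = [rng.uniform(0.0, 10.0) for _ in range(1000)]
costs = [recourse_cost(x, d) for d in demands]

batch = saa_estimate(costs)  # needs the whole sample before it can be computed

running = RunningMean()
for c in costs:              # observations arrive one at a time
    running.update(c)
```

After all observations have been processed the two estimates agree up to floating-point rounding. The difference is that the running estimate is available, and usable by a decision procedure, after every single sample.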

Keywords: stochastic programming, dynamic programming, linear programming, optimization algorithms, sequential sampling
