Markov Decision Processes with Exogenous Variables

Management Science · 2019
Cited by 11
Journal rankings: Renmin University of China A+ · FT50 · UTD24 · ABS 4*

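Summary

The paper proposes two algorithms for solving dynamic programs with exogenous variables: endogenous value iteration and endogenous policy iteration. They are at least as fast as relative value iteration and relative policy iteration, and strictly faster when the endogenous variables converge to their stationary distributions sooner than the exogenous variables do.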

Abstract

I present two algorithms for solving dynamic programs with exogenous variables: endogenous value iteration and endogenous policy iteration. These algorithms are always at least as fast as relative value iteration and relative policy iteration, and they are faster when the endogenous variables converge to their stationary distributions sooner than the exogenous variables. This paper was accepted by Yinyu Ye, optimization.
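
For orientation, the baseline named in the abstract, relative value iteration, can be sketched in its textbook average-reward form. This is not the paper's endogenous value iteration, and the paper's own formulation (for example, whether it discounts) may differ. The function name, the array layout (`P[a][s][t]` for transition probabilities, `r[s][a]` for rewards), the reference state, and the toy two-state problem are all assumptions made for illustration, and the sketch presumes a finite, unichain, aperiodic model.

```python
def relative_value_iteration(P, r, ref_state=0, tol=1e-10, max_iter=10_000):
    """Textbook relative value iteration (illustrative sketch).

    P[a][s][t]: probability of moving from state s to state t under action a.
    r[s][a]:    one-step reward for taking action a in state s.
    Returns (gain, relative values h, greedy policy).
    """
    n_states = len(r)
    n_actions = len(r[0])
    h = [0.0] * n_states
    for _ in range(max_iter):
        # q[s][a] = r[s][a] + sum_t P[a][s][t] * h[t]
        q = [[r[s][a] + sum(P[a][s][t] * h[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        Th = [max(row) for row in q]
        # Subtract the value at the reference state so the iterates stay bounded.
        gain = Th[ref_state]
        h_new = [v - gain for v in Th]
        if max(abs(x - y) for x, y in zip(h_new, h)) < tol:
            policy = [row.index(max(row)) for row in q]
            return gain, h_new, policy
        h = h_new
    raise RuntimeError("relative value iteration did not converge")


# Hypothetical toy problem: state 0 pays 1, state 1 pays 0.
# Action 0 moves to either state with probability 0.5;
# action 1 moves to state 0 with probability 0.9.
P = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.9, 0.1], [0.9, 0.1]],
]
r = [[1.0, 1.0], [0.0, 0.0]]
gain, h, policy = relative_value_iteration(P, r)
```

In this toy problem the greedy policy picks action 1 in both states and the estimated gain is approximately 0.9, the long-run fraction of time spent in the rewarding state.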

Keywords: Markov decision processes · exogenous variables · endogenous value iteration · endogenous policy iteration