Decision Roll and Horizon Roll Processes in Infinite Horizon Discounted Markov Decision Processes

Management Science · 1996
Cited by: 5
Journal rankings: Renmin University of China A+, FT50, UTD24, ABS 4*

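Summary

Studies infinite horizon Markov decision processes in which parameter information is available only over a finite time horizon. The paper proposes a general framework, gives bounds on the loss of optimality for two planning horizon decision processes, and discusses the choice of decision process and planning horizon.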

Abstract

In this paper we look at some aspects of infinite horizon Markov decision processes in which information regarding parameter values is restricted to a finite time horizon, and in which decisions are based upon the finite horizon data but are recomputed as we move forward in time and gain knowledge of later parametric values. A general framework is given. Bounds on the loss of optimality arising from two planning horizon decision processes are given, and the choice of decision process and planning horizon is examined.

Keywords: infinite horizon discounted Markov decision processes; decision roll; horizon roll; finite horizon data