Decision Roll and Horizon Roll Processes in Infinite Horizon Discounted Markov Decision Processes
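This work studies infinite horizon Markov decision processes in which parameter information is limited to a finite time range, proposes a general framework, gives bounds on the loss of optimality for two planning horizon decision processes, and discusses the choice of decision process and planning horizon.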
This paper examines infinite horizon Markov decision processes in which information about parameter values is restricted to a finite time horizon. Decisions are based on the finite horizon data and are recomputed as time moves forward and later parameter values become known. A general framework is given. Bounds on the loss of optimality arising from two planning horizon decision processes are given, and the choice of decision process and planning horizon is examined.
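To make the idea of recomputing decisions over a moving finite window concrete, the following is a minimal sketch of a generic rolling planning horizon procedure. It is an illustration under stated assumptions, not the paper's decision roll or horizon roll process: the function names, the list-based data layout (`rewards[t][s][a]`, `transitions[t][s][a][s2]`), the zero terminal value, and the two-state example are all hypothetical.

```python
def first_action(window_r, window_p, state, beta):
    """Backward induction over the known window (assumed zero terminal value).

    Returns the maximizing first-period action for `state`.
    """
    n_states = len(window_r[0])
    v = [0.0] * n_states  # assumption: nothing is known beyond the window
    best = 0
    for t in reversed(range(len(window_r))):
        new_v = []
        for s in range(n_states):
            # One-step lookahead value of each action in state s at period t.
            q = [
                window_r[t][s][a]
                + beta * sum(p * v[j] for j, p in enumerate(window_p[t][s][a]))
                for a in range(len(window_r[t][s]))
            ]
            new_v.append(max(q))
            if t == 0 and s == state:
                best = q.index(max(q))  # ties resolved toward the lowest index
        v = new_v
    return best


def rolling_policy(rewards, transitions, horizon, beta):
    """At each period t, plan over periods t..t+horizon-1 and keep only the first decision."""
    table = []
    for t in range(len(rewards) - horizon + 1):
        wr = rewards[t:t + horizon]
        wp = transitions[t:t + horizon]
        table.append([first_action(wr, wp, s, beta) for s in range(len(rewards[t]))])
    return table


# Hypothetical example: in state 0, action 0 stays put (reward 1) and
# action 1 moves to state 1 (reward 0); state 1 is absorbing with reward 3.
r = [[1.0, 0.0], [3.0, 3.0]]
p = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
rewards = [r] * 3
transitions = [p] * 3

myopic = rolling_policy(rewards, transitions, horizon=1, beta=0.9)
lookahead = rolling_policy(rewards, transitions, horizon=2, beta=0.9)
```

In this toy instance a one-period window selects action 0 in state 0, while a two-period window selects action 1, which illustrates why the choice of planning horizon affects the loss of optimality.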