Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models

Econometrica · 2002
Cited by 346
Renmin University of China A+ · FT50 · ABS 4*

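Summary

Proposes the Nested Pseudo-Likelihood method (NPL) for estimating discrete Markov decision models, proves that it yields the maximum likelihood estimator, and defines a class of sequential policy iteration estimators that offer a trade-off between finite-sample precision and computational cost.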

Abstract

This paper proposes a procedure for the estimation of discrete Markov decision models and studies its statistical and computational properties. Our Nested Pseudo-Likelihood method (NPL) is similar to Rust's Nested Fixed Point algorithm (NFXP), but the order of the two nested algorithms is swapped. First, we prove that NPL produces the Maximum Likelihood Estimator under the same conditions as NFXP. Our procedure requires fewer policy iterations at the expense of more likelihood-climbing iterations. We focus on a class of infinite-horizon, partial likelihood problems for which NPL results in large computational gains. Second, based on this algorithm we define a class of consistent and asymptotically equivalent Sequential Policy Iteration (PI) estimators, which encompasses both Hotz-Miller's CCP estimator and the partial Maximum Likelihood estimator. This presents the researcher with a "menu" of sequential estimators reflecting a trade-off between finite-sample precision and computational cost. Using actual and simulated data we compare the relative performance of these estimators. In all our experiments the benefits in terms of precision of using a 2-stage PI estimator instead of 1-stage (i.e., Hotz-Miller) are very significant. More interestingly, the benefits of MLE relative to 2-stage PI are small.
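The swapped loop order described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a toy machine-replacement model with five mileage states, two actions, utility linear in the parameters, logit (type 1 extreme value) shocks, known transition matrices and a discount factor of 0.9. Every name in it (`index_terms`, `ccp`, `max_pseudo_lik`, `npl`, `solve_model`) and every number is hypothetical and chosen only for illustration. The outer loop performs one policy valuation per stage, and the inner loop climbs a pseudo-likelihood with the choice probabilities held fixed.

```python
import numpy as np

BETA, S = 0.9, 5  # assumed discount factor and number of mileage states

# Assumed transitions F[a] (S x S): action 0 = keep, action 1 = replace.
F = np.zeros((2, S, S))
for x in range(S):
    F[0, x, x] += 0.4
    F[0, x, min(x + 1, S - 1)] += 0.6
    F[1, x, 0] += 0.4
    F[1, x, 1] += 0.6

# Utility linear in theta = (replacement cost, maintenance cost): u_a = Z[a] @ theta.
Z = np.zeros((2, S, 2))
Z[0, :, 1] = -np.arange(S)  # keep: maintenance cost grows with mileage
Z[1, :, 0] = -1.0           # replace: pay the replacement cost

def index_terms(P):
    """Policy valuation for interior CCPs P (S x 2): returns z_tilde (2,S,K), e_tilde (2,S)."""
    FP = P[:, 0, None] * F[0] + P[:, 1, None] * F[1]
    A = np.eye(S) - BETA * FP
    Wz = np.linalg.solve(A, P[:, 0, None] * Z[0] + P[:, 1, None] * Z[1])
    We = np.linalg.solve(A, (P * (np.euler_gamma - np.log(P))).sum(axis=1))
    return Z + BETA * F @ Wz, BETA * F @ We

def ccp(theta, zt, et):
    """Logit choice probabilities (S x 2) implied by the index zt @ theta + et."""
    v = zt @ theta + et
    p = np.exp(v - v.max(axis=0))
    return (p / p.sum(axis=0)).T

def loglik(theta, zt, et, n):
    return (n * np.log(ccp(theta, zt, et))).sum()

def max_pseudo_lik(zt, et, n, iters=50):
    """Inner step: Newton ascent with step-halving on the concave pseudo-likelihood."""
    theta = np.zeros(zt.shape[2])
    dz = zt[1] - zt[0]   # binary-logit regressors, shape (S, K)
    nx = n.sum(axis=1)   # observations (or weights) per state
    for _ in range(iters):
        p1 = ccp(theta, zt, et)[:, 1]
        grad = dz.T @ (n[:, 1] - nx * p1)
        hess = -(dz * (nx * p1 * (1 - p1))[:, None]).T @ dz
        step = np.linalg.solve(hess, -grad)
        t = 1.0
        while loglik(theta + t * step, zt, et, n) < loglik(theta, zt, et, n) and t > 1e-8:
            t /= 2
        theta = theta + t * step
    return theta

def npl(P0, n, K):
    """K-stage sequential policy-iteration estimator started from CCPs P0."""
    P = P0
    for _ in range(K):
        zt, et = index_terms(P)            # one policy valuation per stage
        theta = max_pseudo_lik(zt, et, n)  # likelihood climbing with P held fixed
        P = ccp(theta, zt, et)             # policy improvement step
    return theta, P

def solve_model(theta, iters=500):
    """True CCPs by value iteration on the integrated value function."""
    V = np.zeros(S)
    for _ in range(iters):
        V = np.euler_gamma + np.log(np.exp(Z @ theta + BETA * F @ V).sum(axis=0))
    v = Z @ theta + BETA * F @ V
    p = np.exp(v - v.max(axis=0))
    return (p / p.sum(axis=0)).T

# Usage: population weights n equal to the true CCPs, started from the true CCPs.
theta_true = np.array([2.0, 0.5])
P_true = solve_model(theta_true)
theta_1, _ = npl(P_true, P_true, K=1)
theta_3, P_3 = npl(P_true, P_true, K=3)
```

In this sketch the inner maximization is an ordinary logit problem, which is why each stage is cheap compared with re-solving the dynamic program at every trial parameter value. With real data one would pass observed choice counts as `n` and a frequency estimate of the choice probabilities as `P0`; under those assumptions `K=1` plays the role of the 1-stage (Hotz-Miller) estimator mentioned in the abstract, and larger `K` moves along the "menu" toward NPL.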

Discrete Markov decision models · Sequential policy iteration estimators