Optimal Learning with Endogenous Data
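Examines the need for, and the implications of, ε-optimality in learning problems. Considers a Bayesian decision maker who trades off current reward against the accumulation of information, and shows that there is an ε-optimal policy under which any identified parameter can be learned, although other ε-optimal policies may have very different limit behavior.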
This paper is concerned with the need for, and the implications of, ε-optimality in learning problems. The authors consider a control problem in which a Bayesian decisionmaker faces a trade-off between expected current reward and accumulation of information. An example showing the need for the notion of ε-optimality and the possibility of discontinuous transition functions is given. It is shown that there is always an ε-optimal policy that allows the decisionmaker to learn any identified parameters, but that there are other ε-optimal policies with very different limit behavior. Copyright 1989 by Economics Department of the University of Pennsylvania and the Osaka University Institute of Social and Economic Research Association.