Optimal Control of an Unknown Linear Process with Learning
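This paper studies how an economic agent facing an unknown linear process trades off current reward against information acquisition, optimizing his action strategy through Bayesian updating. It also examines the existence of an optimal strategy and the limiting behavior of beliefs and actions.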
Rational expectations models naturally raise the question of [...] The hypothesis that agents process information efficiently in an effort to learn about their environment is much stronger than the hypothesis that profit opportunities do not systematically go unexploited. However, the efficient information-processing hypothesis is itself much weaker than the hypothesis that agents actively seek to learn about their environment, even when learning is costly. This latter case is known as active learning. A natural modelling strategy is to assume that agents, faced with a tradeoff between current-period reward and information generation, allocate their efforts optimally given their beliefs about the economy. This turns out to be a difficult problem to study, and in this paper we focus attention on the optimizing behavior of an agent in an economy in which his behavior can generate information.

For simplicity, the process that we study is the linear regression process with independent errors. The agent expresses his beliefs about the unknown parameters, which can include parameters of the error process as well as regression coefficients, in the form of a probability distribution. At date t the agent chooses an action on the basis of his beliefs at that date. The action is chosen taking account of both the one-period reward resulting from the action and the value of the expected information gain from that action. After the action is taken, an outcome is observed. The outcome, together with the action, adds to the agent's stock of information about the process generating the data. We assume that the agent processes this information in accordance with the laws of probability, i.e., by Bayes' Rule. In this paper we are concerned with the existence of the optimal strategy, with the limiting behavior of the sequence of the agent's beliefs, and with the limiting behavior of the optimal action.
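The belief-updating step described above can be illustrated with a minimal sketch. It is not the paper's general setup: it assumes a regression y = alpha + beta * x + error with normal errors of known variance and a normal prior over (alpha, beta), so that Bayes' Rule has a closed form. The function names `update_beliefs` and `simulate`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def update_beliefs(mean, prec, x, y, noise_var=1.0):
    """One Bayes' Rule update of a normal belief over (alpha, beta).

    Assumes (illustration only) a normal prior and a known error variance.
    mean: [m_alpha, m_beta], the current belief mean
    prec: 2x2 precision matrix (inverse covariance) as nested lists
    x, y: the action taken and the outcome observed
    """
    z = [1.0, x]  # regressor vector: intercept and action
    # Posterior precision = prior precision + z z' / noise_var
    new_prec = [[prec[i][j] + z[i] * z[j] / noise_var for j in range(2)]
                for i in range(2)]
    # Right-hand side: prior precision times prior mean, plus z * y / noise_var
    rhs = [prec[i][0] * mean[0] + prec[i][1] * mean[1] + z[i] * y / noise_var
           for i in range(2)]
    # Solve the 2x2 system new_prec @ new_mean = rhs by Cramer's rule
    det = new_prec[0][0] * new_prec[1][1] - new_prec[0][1] * new_prec[1][0]
    new_mean = [(new_prec[1][1] * rhs[0] - new_prec[0][1] * rhs[1]) / det,
                (new_prec[0][0] * rhs[1] - new_prec[1][0] * rhs[0]) / det]
    return new_mean, new_prec


def simulate(actions, alpha=1.0, beta=2.0, noise_var=1.0, seed=0):
    """Feed a given sequence of actions through the process and update beliefs."""
    rng = random.Random(seed)
    mean, prec = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]  # hypothetical prior
    for x in actions:
        y = alpha + beta * x + rng.gauss(0.0, noise_var ** 0.5)
        mean, prec = update_beliefs(mean, prec, x, y, noise_var)
    return mean, prec


if __name__ == "__main__":
    rng = random.Random(1)
    varied = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    print("varied actions:", simulate(varied)[0])
    print("fixed action:  ", simulate([1.0] * 2000)[0])
```

Under these assumptions, the sketch shows why the choice of action matters for information: with varied actions the belief mean moves toward the true (alpha, beta), whereas a constant action x only informs the agent about the combination alpha + beta * x. The sketch takes the action sequence as given and does not attempt the optimization over actions that the paper studies.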
The linear structure has been used before in economics to study the question of [...]

We thank David Easley and Ingmar Prucha for helpful discussions on the topic treated here. Related material has been presented at the NBER-NSF Seminar on Bayesian Inference, the University of Iowa, and the Conference on Dynamic Econometrics at Austin, Texas. We are grateful to participants for their suggestions. This research is supported in part by the NSF.