Optimal Learning by Experimentation
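Studies how a single decision maker learns optimally through experimentation, focusing on the characterisation of limit beliefs and actions; finds that local properties of the payoff function (such as smoothness) determine whether the true maximum payoff is eventually attained.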
This paper considers the problem of optimal learning through experimentation by a single decision maker. Most of the analysis is concerned with the characterisation of limit beliefs and actions. We take a two-stage approach to this problem: first, we study the case where the agent's payoff function is deterministic; then, we address the additional issues that arise when noise is present. Our analysis indicates that local properties of the payoff function (such as smoothness) are crucial in determining whether the agent eventually attains the true maximum payoff. The paper also makes a limited attempt at characterising optimal experimentation strategies.