Optimal Experimentation in a Changing Environment
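Studies how a monopolist facing an unknown, randomly changing demand curve can maximize infinite-horizon profits through an optimal experimentation strategy, and finds two sharply different strategy regimes that depend on the discount rate and the intensity of demand switching.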
This paper studies optimal experimentation by a monopolist who faces an unknown demand curve subject to random changes, and who maximizes profits over an infinite horizon in continuous time. We show that there are two qualitatively different regimes, determined by the discount rate and the intensities of demand curve switching, and that the optimal policy depends discontinuously on these parameters. One regime is characterized by extreme experimentation and good tracking of the prevailing demand curve, the other by moderate experimentation and poor tracking. Moreover, in the latter regime the agent eventually becomes "trapped" into taking actions in a strict subset of the feasible set.