Near-Term Liability of Exploitation: Exploration and Exploitation in Multistage Problems

ORGANIZATION SCIENCE · 2008
Cited by 69
Renmin University A · FT50 · UTD24 · ABS 4*

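Summary

This study examines the trade-off between exploration and exploitation in multistage decision problems. It finds that even when beliefs are unbiased, maximizing exploitation can yield a lower expected payoff than an exploratory strategy, and that exploration can lead to more robust actions.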

Abstract

The classic trade-off between exploration and exploitation reflects the tension between gaining new information about alternatives to improve future returns and using the information currently available to improve present returns. By considering these issues in the context of a multistage, as opposed to a repeated, problem environment, we show that exploratory behavior has value quite apart from its role in revising beliefs. We show that even if current beliefs provide an unbiased characterization of the problem environment, maximizing with respect to these beliefs may lead to an inferior expected payoff relative to other mechanisms that make less aggressive use of the organization's beliefs. Search can lead to more robust actions in multistage decision problems than maximization, a benefit quite apart from its role in the updating of beliefs.

Keywords: stochastic games · decision theory · organizational learning · multistage decision making