Understanding Managers’ Trade-Offs Between Exploration and Exploitation
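Using an incentive-aligned experiment and a dynamic learning model, this study examines how managers balance exploration and exploitation in decisions such as new product development and pricing. It finds that decision makers overexplore low-performing options, forgoing more than 30% of potential revenue, and that risk-averse decision makers explore for longer.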
Managers frequently explore new strategies, and exploit familiar ones, when making decisions on new product development, pricing, or advertising. Exploring for too long, or exploiting too soon, will generate inferior financial returns. Our research describes decision makers' exploration/exploitation trade-offs and their link to psychometric traits. We conduct an incentive-aligned study in which subjects play a multiarmed bandit game, and we evaluate how they balance exploration and exploitation. To formally describe exploration/exploitation trade-offs, we develop a behavioral model that captures latent dynamics in learning behavior. Subjects transition among three unobserved states (exploration, exploitation, and inertia) while updating their beliefs about expected payoffs. Our analysis suggests that decision makers overexplore low-performing options, forgoing over 30% of potential revenue. They rely heavily on recent experiences. Risk-averse decision makers spend more time exploring. Maximizers are more sensitive to payoffs than satisficers. Our research lays the groundwork needed to devise remedial actions aimed at helping managers find an optimal balance between exploration and exploitation. One way to achieve this goal is by carefully designing the learning environment. In two additional studies, we analyze the evolution of exploration/exploitation trade-offs across different learning environments. Offering decision makers repeated opportunities to learn and increasing the planning horizon appear beneficial.
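To make the setup concrete, the following is a minimal simulation sketch of an agent that moves among the three latent states while playing a Bernoulli multiarmed bandit. It is not the authors' estimated model: the transition probabilities, the payoff probabilities, the recency weight, and the names `simulate`, `TRANSITIONS`, and `forgone_share` are all hypothetical choices made for illustration. Forgone revenue is measured here against the expected payoff of always playing the best arm, which is one plausible benchmark among several.

```python
import random

STATES = ("explore", "exploit", "inertia")

# Assumed (hypothetical) transition probabilities between latent states.
# Each tuple gives P(next = explore, exploit, inertia) and sums to 1.
TRANSITIONS = {
    "explore": (0.6, 0.3, 0.1),
    "exploit": (0.1, 0.7, 0.2),
    "inertia": (0.1, 0.2, 0.7),
}


def simulate(true_p=(0.2, 0.4, 0.6, 0.8), n_rounds=200, recency=0.5, seed=0):
    """Simulate a three-state agent on a Bernoulli bandit (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_p)
    beliefs = [0.5] * n_arms          # prior belief about each arm's payoff
    state = "explore"
    last_arm = rng.randrange(n_arms)  # arbitrary starting arm
    earned = 0
    visits = {s: 0 for s in STATES}

    for _ in range(n_rounds):
        visits[state] += 1
        if state == "explore":
            arm = rng.randrange(n_arms)                        # try any arm
        elif state == "exploit":
            arm = max(range(n_arms), key=lambda a: beliefs[a])  # best belief
        else:
            arm = last_arm                                     # inertia: repeat

        payoff = 1 if rng.random() < true_p[arm] else 0
        # Recency-weighted update: a larger `recency` puts more weight
        # on the most recent outcome.
        beliefs[arm] += recency * (payoff - beliefs[arm])
        earned += payoff
        last_arm = arm
        state = rng.choices(STATES, weights=TRANSITIONS[state])[0]

    # Benchmark: expected revenue from always playing the best arm.
    benchmark = max(true_p) * n_rounds
    return {
        "earned": earned,
        "forgone_share": 1 - earned / benchmark,
        "visits": visits,
        "beliefs": beliefs,
    }


if __name__ == "__main__":
    result = simulate()
    print(result["visits"], round(result["forgone_share"], 3))
```

Raising the `explore` row's self-transition probability in `TRANSITIONS` keeps the simulated agent exploring longer, which gives a rough way to see how prolonged exploration of weak arms can translate into forgone revenue under these assumptions.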