Collaborative Learning and Decision Making on Pricing and Recommendation: A Simple Framework for Planning
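This paper studies a multi-team collaborative pricing and recommendation decision problem, proposes a greedy algorithm and a GWS algorithm that balance learning and earning under centralized or decentralized planning, and validates their effectiveness through simulations.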
We formulate a collaborative learning and decision-making problem involving contextual information. In current business practice, pricing and recommendation decisions are often made jointly by multiple teams in sequence. The decision-making processes of the different teams can be controlled by either a centralized or a decentralized planner. We propose a simple collaboration framework that integrates learning and decision making in an unknown environment. The main challenge in a decentralized framework is that each team does not know the decision-making processes of the other teams, yet the subsequent decisions are mutually dependent. Motivated by practical concerns about high exploration costs and implementation complexity, we propose a simple greedy algorithm for centralized planners and a “greedy” + “weighted sampling” (GWS) algorithm for both centralized and decentralized planners to balance learning and earning. We show that the exploration-free greedy algorithm can achieve the optimal rate when context diversity holds. The GWS algorithm works effectively for either centralized or decentralized planners under a much weaker condition, which we call context variation. Furthermore, we extend our framework to the multiproduct pricing and ranking problem and study the issue of model misspecification. We validate our results using simulations on synthetic and real data. Numerical studies show the superior performance of the two proposed frameworks for different types of planners. This paper was accepted by J. George Shanthikumar, data science. Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2023.00320.