Utility Fairness in Contextual Dynamic Pricing with Demand Learning
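Proposes a contextual bandit algorithm that performs personalized pricing under uncertain demand while satisfying utility fairness constraints, attains an optimal regret upper bound, and analyzes the cost of fairness and its effect on the balance between utility and revenue.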
This paper introduces a novel contextual bandit algorithm for personalized pricing under utility fairness constraints when demand is uncertain, achieving an optimal regret upper bound. Our approach, which combines dynamic pricing with demand learning, addresses the critical challenge of fairness in pricing strategies. We first study the static full-information setting, where we formulate the optimal pricing policy as a constrained optimization problem and propose an approximation algorithm that efficiently computes a near-optimal policy. We also use mathematical analysis and computational studies to characterize the structure of optimal contextual pricing policies subject to fairness constraints, deriving simplified policies that lay the foundation for more in-depth research and extensions. We then extend our study to dynamic pricing problems with demand learning, establishing a nonstandard regret lower bound that highlights the complexity added by fairness constraints. Our research offers a comprehensive analysis of the cost of fairness and its impact on the balance between utility and revenue maximization. This work represents a step toward integrating ethical considerations into algorithmic efficiency in data-driven dynamic pricing. This paper was accepted by J. George Shanthikumar, big data analytics. Funding: X. Chen acknowledges support from the National Science Foundation [Grant IIS-1845444]. D. Simchi-Levi thanks the MIT Data Science Lab for support. Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2023.03956.