Dynamic Pricing with Unknown Nonparametric Demand and Limited Price Changes
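This work studies how a retailer facing an unknown demand function and costly price changes can achieve near-optimal cumulative revenue with a limited number of price adjustments. It proposes a pricing policy based on second-order approximations, and numerical experiments show that the policy substantially reduces the number of price changes while achieving comparable regret.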
Dynamic pricing with demand learning is a common problem for retailers. In this problem, the retailer aims to maximize cumulative revenue collected over a finite time horizon by balancing two objectives: learning demand and maximizing revenue. In their paper “Dynamic Pricing with Unknown Nonparametric Demand and Limited Price Changes,” Perakis and Singhvi study this problem when the retailer makes no parametric assumption on the demand and seeks to reduce the amount of price experimentation because of the potential costs associated with price changes. They construct a pricing policy that uses second-order approximations of the unknown demand function and establish when the proposed policy achieves a near-optimal rate of regret while making very limited price changes. They also perform extensive numerical experiments to show that their proposed policy substantially improves over existing methods in the total number of price changes, with comparable performance on the cumulative regret metric.
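The two metrics the abstract compares, cumulative regret and the number of price changes, can be made concrete with a small sketch. Everything below is an illustrative assumption rather than the authors' policy or experimental setup: the demand curve is a hypothetical noiseless linear function, the price path is hand-picked, and the helper names `cumulative_regret` and `count_price_changes` are invented for this example.

```python
def revenue(price, demand):
    """Revenue earned in one period at a given price."""
    return price * demand(price)


def cumulative_regret(prices, demand, optimal_price):
    """Sum of per-period revenue gaps relative to the optimal fixed price."""
    best = revenue(optimal_price, demand)
    return sum(best - revenue(p, demand) for p in prices)


def count_price_changes(prices):
    """Number of periods in which the price differs from the previous period."""
    return sum(1 for prev, cur in zip(prices, prices[1:]) if cur != prev)


# Hypothetical demand d(p) = 10 - 2p (an assumption, not from the paper).
# Revenue p * (10 - 2p) is maximized at p = 2.5.
def demand(p):
    return 10.0 - 2.0 * p


# Hand-picked path: test a low price, then a high price, then settle at 2.5.
prices = [1.0] * 3 + [4.0] * 3 + [2.5] * 4

regret = cumulative_regret(prices, demand, optimal_price=2.5)
changes = count_price_changes(prices)
print(regret, changes)
```

In this toy setting the retailer knows nothing about `demand` in advance, so every period spent at a suboptimal price adds to regret, while every switch between price levels adds to the count of price changes. The trade-off described in the abstract is between keeping both numbers small at once.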