Network Revenue Management with Nonparametric Demand Learning: √T-Regret and Polynomial Dimension Dependency
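Summary: The paper studies the network revenue management problem in which a retailer dynamically prices n products over a finite selling season subject to m resource constraints. It proposes a robust ellipsoid method that is the first, under a nonparametric demand model, to achieve a regret upper bound given by a term in T multiplied by a polynomial function of n.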
This paper studies the classic price-based network revenue management (NRM) problem with demand learning. The retailer dynamically sets the prices of n products over a finite selling season of length T, subject to m resource constraints, in order to maximize cumulative revenue. We focus on a nonparametric demand model under mild technical assumptions that are satisfied by most commonly used demand functions. We propose a robust ellipsoid method adapted to the NRM setting in a nontrivial manner. This is the first result in the literature on the nonparametric NRM problem to achieve regret of the form [Formula: see text] (where [Formula: see text] is a polynomial function of [Formula: see text]).

Funding: S. Miao gratefully acknowledges financial support provided by the Ruegg Family Scholar and the Leeds School of Business.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/moor.2022.0086.
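
For readers unfamiliar with ellipsoid methods, the following is a minimal sketch of the classical central-cut ellipsoid update applied to a toy pricing problem. It is not the robust variant proposed in the paper: it assumes exact gradients, no resource constraints, and no demand learning. The linear demand model, the coefficients `A` and `B`, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy model (an assumption, not from the paper):
# demand for product i is A[i] - B[i] * p[i], so revenue is concave in prices.
A = [10.0, 8.0]
B = [1.0, 2.0]


def revenue(prices):
    """Total revenue under the assumed linear demand model."""
    return sum(p * (a - b * p) for p, a, b in zip(prices, A, B))


def neg_revenue_grad(prices):
    """Gradient of the negative revenue (a convex function to minimize)."""
    return [-(a - 2.0 * b * p) for p, a, b in zip(prices, A, B)]


def ellipsoid_step(center, shape, grad):
    """One classical central-cut update (requires dimension n >= 2).

    The ellipsoid is {x : (x - c)^T P^{-1} (x - c) <= 1} with c = center
    and P = shape. The update returns the minimum-volume ellipsoid that
    contains the half {x : grad^T (x - c) <= 0}.
    """
    n = len(center)
    Pg = [sum(shape[i][j] * grad[j] for j in range(n)) for i in range(n)]
    gPg = sum(grad[i] * Pg[i] for i in range(n))
    if gPg <= 1e-18:  # gradient is numerically zero: nothing to cut
        return center, shape
    scale = math.sqrt(gPg)
    g_tilde = [v / scale for v in Pg]
    new_center = [center[i] - g_tilde[i] / (n + 1) for i in range(n)]
    factor = n * n / (n * n - 1.0)
    new_shape = [
        [factor * (shape[i][j] - 2.0 / (n + 1) * g_tilde[i] * g_tilde[j])
         for j in range(n)]
        for i in range(n)
    ]
    return new_center, new_shape


def run_ellipsoid(iterations=200):
    """Shrink a ball of radius 10 around the origin; return the best center seen."""
    center = [0.0, 0.0]
    shape = [[100.0, 0.0], [0.0, 100.0]]  # P = r^2 * I with r = 10
    best = center
    for _ in range(iterations):
        if revenue(center) > revenue(best):
            best = center
        center, shape = ellipsoid_step(center, shape, neg_revenue_grad(center))
    return best


if __name__ == "__main__":
    best_prices = run_ellipsoid()
    print(best_prices, revenue(best_prices))
```

In this toy model the revenue-maximizing prices are A[i] / (2 * B[i]), so the best center should approach (5, 2). In the setting of the paper the demand function is unknown and must be learned from noisy sales observations, which is why a robust version of the cut is needed there.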