基于在线逆批梯度下降法的网络收益管理

Network revenue management with online inverse batch gradient descent method

Production and Operations Management · 2023

被引 35

人大 AFT50UTD24ABS 4

Yiwei Chen · 天普大学通讯
Cong Shi · 迈阿密大学

中文导读

针对企业不知需求函数、需从销售数据学习的网络收益管理问题，提出在线逆批梯度下降算法，在市场份额空间做梯度下降再映射回价格，实现与产品数量无关的遗憾界。

Abstract

We consider a general class of price‐based network revenue management problems that a firm aims to maximize revenue from multiple products produced with multiple types of resources endowed with limited inventory over a finite selling season. A salient feature of our problem is that the firm does not know the underlying demand function that maps prices to demand rate, which must be learned from sales data. It is well known that for almost all classes of demand functions, such as linear, exponential, multinomial logit, and nested logit models, the revenue rate function is not concave in the products' prices but is concave in products' market shares (or price‐controlled demand rates). This creates challenges in adopting any stochastic gradient descent based methods in the price space. We propose a novel nonparametric learning algorithm termed the online inverse batch gradient descent (IGD) algorithm. This algorithm proceeds in batches. In each batch, the firm implements each product's perturbed prices and then uses the sales information to estimate the market shares. Leveraging these estimates, the firm carries out a stochastic gradient descent step in the market share space that takes into account the relative inventory scarcity for the entire horizon and then inversely maps the updated market shares back to the price space to obtain the prices for the next batch. Moreover, we also propose an inventory‐adjusted algorithm (IGD‐I) that the feasible market share set is dynamically adjusted to capture the real‐time relative inventory scarcity for the remaining season. For the large‐scale systems wherein all resources' inventories and the length of the horizon are proportionally scaled by a parameter k , we establish a dimension‐independent regret bound of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline" overflow="scroll"> <mml:semantics definitionURL="" encoding=""> <mml:mrow> <mml:mi>O</mml:mi> <mml:mo stretchy="false">(</mml:mo> <mml:msup> <mml:mi>k</mml:mi> <mml:mrow> <mml:mn>4</mml:mn> <mml:mo>/</mml:mo> <mml:mn>5</mml:mn> </mml:mrow> </mml:msup> <mml:mi>log</mml:mi> <mml:mi>k</mml:mi> <mml:mo stretchy="false">)</mml:mo> </mml:mrow> <mml:annotation encoding="">$O(k^{4/5} \log k)$</mml:annotation> </mml:semantics> </mml:math> . This result is independent of the number of products and resources and works for a continuum action‐set prices and the demand functions that are only once differentiable . Our theoretical result guarantees the efficacy of both algorithms in the high‐dimensional systems where the number of products or resources is large and the prices are continuous. Our algorithms also numerically outperform the existing algorithms in the literature.

收益管理运筹学机器学习定价优化

阅读原文 ↗