Risk Aversion in a Data-Driven Multi-period Inventory Control Problem

Production and Operations Management · 2026

Cited by 0 · Top 6% in the same journal and year

Renmin University A · FT50 · UTD24 · ABS 4

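Xianghua Jiang · The Chinese University of Hong Kong
Xun Zhang · Southern University of Science and Technology (corresponding author)
Loon-Ching Tang · National University of Singapore
Zhisheng Ye · National University of Singapore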

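Summary

The paper studies how a retailer who knows only historical demand data, and not the demand distribution, can set a multi-period risk-averse inventory policy. It proves sample complexity bounds for the data-driven policy that ensure its risk is close to optimal.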

Abstract

We study multi-period risk-averse inventory control in a data-driven setting. In this problem, a risk-averse retailer makes periodic decisions on inventory levels based only on historical demand observations without full knowledge of the demand distribution. We adopt the popular nested formulation for risk-averse programs to formulate this multi-period problem and its data-driven counterpart under a coherent risk measure. Our objective is to study the sample complexity bound such that with high probability, the data-driven policy is near-optimal, that is, the relative error of risk under the data-driven policy compared with the optimal risk is arbitrarily small. Analysis of this problem is inherently challenging, because the multi-period nature requires solving the risk-averse program and its data-driven version recursively backward in time, while the (empirical) risk-to-go functions in this process do not have closed-form derivatives for most risk measures, which renders existing first-order methods for the risk-neutral newsvendor model invalid. In this study, we develop a zeroth-order framework to establish the complexity bound on sample sizes to guarantee near-optimality of the data-driven policy with given accuracy levels. Instead of using first-order derivative information on the risk-to-go function, our analysis directly examines the class of functions that underpins each cumulative risk function and derives maximum inequalities for this functional class by computing the covering numbers. Finite-sample complexity bounds are then used to establish asymptotic properties of the estimated risk, including consistency and convergence rate. Computationally, the time complexity for solving the data-driven policy, which is essentially an empirical dynamic programming (EDP) estimator of the optimal policy, increases exponentially in the length of the planning horizon. To speed up computation, we propose an approximation scheme that recursively approximates the empirical cumulative risk function with a convex piecewise linear function and then minimizes it to obtain a modified data-driven inventory policy. We show that with proper control for approximation error, the modified data-driven policy is also near-optimal, and it has the same order of sample complexity bound as that for the original EDP policy.
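
To make the backward recursion in the abstract concrete, the following is a minimal sketch of an empirical dynamic program under a nested risk measure. It is an illustration under stated assumptions, not the authors' algorithm: it assumes CVaR as the coherent risk measure, linear ordering, holding, and backlog costs, full backlogging, and a zero terminal risk-to-go. The names `empirical_cvar`, `empirical_dp`, and the parameters `c`, `h`, `b`, `alpha` are hypothetical. The risk-to-go is stored on a fixed inventory grid and interpolated linearly, which is a swapped-in simplification and not the convex piecewise linear approximation scheme proposed in the paper.

```python
import numpy as np


def empirical_cvar(costs, alpha):
    """CVaR at level alpha of equally weighted samples (Rockafellar-Uryasev form)."""
    costs = np.asarray(costs, dtype=float)
    # t + E[(Z - t)^+] / (1 - alpha) is convex and piecewise linear in t,
    # so its minimum is attained at one of the sample points.
    thresholds = costs[:, None]
    excess = np.maximum(costs[None, :] - thresholds, 0.0).mean(axis=1)
    return float(np.min(costs + excess / (1.0 - alpha)))


def empirical_dp(demand_samples, grid, alpha, c=1.0, h=1.0, b=4.0):
    """Backward recursion on an inventory grid (assumed cost structure).

    demand_samples[t] holds the historical demand observations for period t.
    Returns the per-period base-stock levels and the first-period
    risk-to-go evaluated at every starting inventory on the grid.
    """
    grid = np.asarray(grid, dtype=float)
    value_next = np.zeros_like(grid)  # assumed zero terminal risk-to-go
    base_stock = []
    for demands in reversed(demand_samples):
        d = np.asarray(demands, dtype=float)
        w = np.empty_like(grid)
        for j, y in enumerate(grid):
            # Stage cost after ordering up to y and observing each sampled demand.
            stage = h * np.maximum(y - d, 0.0) + b * np.maximum(d - y, 0.0)
            # Risk-to-go at the next state y - d; np.interp clamps to the
            # end values outside the grid, a crude boundary assumption.
            future = np.interp(y - d, grid, value_next)
            w[j] = c * y + empirical_cvar(stage + future, alpha)
        # From state x only levels y >= x are reachable: suffix minimum of w.
        suffix_min = np.minimum.accumulate(w[::-1])[::-1]
        value_next = suffix_min - c * grid
        base_stock.append(float(grid[int(np.argmin(w))]))
    return base_stock[::-1], value_next
```

With `alpha=0` the CVaR reduces to the sample mean, so a single-period call recovers the empirical risk-neutral newsvendor solution on the grid; raising `alpha` puts more weight on the worst sampled outcomes. The grid keeps the cost per period fixed at the price of interpolation error, which the sketch does not control.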

Keywords: inventory control · risk aversion · data-driven decision-making · dynamic programming · sample complexity
