Beyond IID: Data-Driven Decision Making in Heterogeneous Environments

Management Science · 2025
Citations: 0
Journal rankings: Renmin University of China A+ · FT50 · UTD24 · ABS 4*

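Summary

This paper studies how to use data for decision making when the historical data differ from the future distribution in unknown ways. It analyzes the performance of policies such as sample average approximation (SAA) and designs problem-dependent policies with strong guarantees for problems such as newsvendor and pricing.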

Abstract

How should one leverage historical data when past observations are not perfectly indicative of the future, for example, because of unobserved confounders that one cannot “correct” for? Motivated by this question, we study a data-driven decision-making framework in which historical samples are generated from unknown and different distributions, each assumed to lie in a heterogeneity ball of known radius centered around the (also unknown) future (out-of-sample) distribution on which the performance of a decision will be evaluated. This work analyzes the performance of central data-driven policies, as well as near-optimal ones, in these heterogeneous environments and aims to understand the key drivers of performance. We establish a first result that allows us to upper bound the asymptotic worst-case regret of a broad class of policies. Leveraging this result, for any integral probability metric, we provide a general analysis of the performance achieved by sample average approximation (SAA) as a function of the radius of the heterogeneity ball. This analysis is centered around the approximation parameter, a notion of complexity we introduce to capture how the interplay between the heterogeneity and the problem structure impacts the performance of SAA. In turn, we illustrate, through several widely studied problems (for example, newsvendor and pricing), how this methodology can be applied, and we find that the performance of SAA varies considerably depending on the combination of problem class and heterogeneity. The failure of SAA for certain instances motivates the design of alternative policies to achieve rate optimality. We derive problem-dependent policies achieving strong guarantees for the illustrative problems described above and provide initial results toward a principled approach for the design and analysis of general rate-optimal algorithms.

This paper was accepted by Vivek Farias, data science.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/mnsc.2022.03448
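
To make the SAA policy named in the abstract concrete, here is a minimal Python sketch for the newsvendor problem. It is not taken from the paper: the function names, the cost parameters `underage` and `overage`, and the toy demand history are all illustrative assumptions. It relies on the standard fact that the order quantity minimizing the empirical newsvendor cost is an empirical quantile of the demand samples at the critical ratio underage / (underage + overage). The sketch treats the samples as given and does not model the heterogeneity ball; in the paper's setting, the historical samples could come from distributions that differ from the one on which the decision is evaluated.

```python
import math


def newsvendor_cost(order, demand, underage, overage):
    """Cost of ordering `order` units when the realized demand is `demand`."""
    return underage * max(demand - order, 0) + overage * max(order - demand, 0)


def empirical_cost(order, samples, underage, overage):
    """Average newsvendor cost of `order` over the historical demand samples."""
    total = sum(newsvendor_cost(order, d, underage, overage) for d in samples)
    return total / len(samples)


def saa_newsvendor(samples, underage, overage):
    """Return an order quantity that minimizes the empirical cost (SAA).

    The minimizer is the smallest sample whose empirical CDF value reaches
    the critical ratio underage / (underage + overage).
    """
    if not samples:
        raise ValueError("need at least one demand sample")
    ratio = underage / (underage + overage)
    ordered = sorted(samples)
    # Index of the empirical quantile at the critical ratio (1-based k).
    k = max(1, math.ceil(ratio * len(ordered)))
    return ordered[k - 1]


# Hypothetical toy history; in a heterogeneous environment these samples
# need not come from the distribution used to evaluate the decision.
history = [10, 20, 30, 40]
order = saa_newsvendor(history, underage=3, overage=1)  # order == 30 here
```

With this toy history the critical ratio is 0.75, so the sketch returns the third-smallest sample. A plain sort is used instead of a library quantile routine so that the tie-breaking rule stays explicit.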

Keywords: heterogeneity · data-driven decision making · regret bounds · sample average approximation