Contextual Inverse Optimization: Offline and Online Learning
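Studies how to reverse engineer an expert's decision-making process from data on the expert's past optimal decisions, and quantifies the imitation performance achievable under two modes of data collection: offline and online.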
Learning from data is critical across applications. However, in many applications, past data give only partial information about the future. In “Contextual Inverse Optimization: Offline and Online Learning,” Besbes, Fonseca, and Lobel study a general setting in which the historical data consist of observations of past optimal actions taken by experts in specific contexts, without the underlying rewards associated with those actions. To what extent can one “reverse engineer” the experts' underlying decision-making process and mimic them? The authors develop results that quantify the performance achievable given the data at hand in two types of settings: the offline setting, in which data have already been collected, and the online setting, in which data are collected “on the fly.”