Feature Misspecification in Sequential Learning Problems
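Summary: The paper studies sequential learning in which the decision maker assumes that system performance depends linearly on a set of features, although this model may be misspecified, which gives rise to sample-selection endogeneity and degrades performance. It proposes a prospective sampling principle that eliminates the effects of misspecification as the sample size grows and that applies to a wide range of sampling policies.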
We consider a class of sequential learning problems where a decision maker must learn the unknown statistical characteristics of a finite set of alternatives (or systems) using sequential sampling to ultimately select a subset of “good” alternatives. A salient feature of our problem is that system performance is governed by a set of features. The decision maker postulates the dependence on these features to be linear, but this model may not precisely represent the true underlying system structure. We show that this misspecification, if not managed properly, can lead to suboptimal performance because of a phenomenon identified as sample-selection endogeneity. We propose a prospective sampling principle, a new approach that eliminates the adverse effects of misspecification as the number of samples grows large. The proposed principle applies across a very general class of widely used sampling policies, enjoys strong asymptotic performance guarantees, and exhibits effective finite-sample performance in numerical experiments.
This paper was accepted by Vivek Farias, data science.
Funding: This work was supported by the United States-Israel Binational Science Foundation [Grant 2020063] and the Hong Kong Research Grant Council [GRF Grant 16501821 and ECS Grant 24210420].
Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2022.00328.
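As a toy illustration of why misspecification and sample selection interact, the sketch below fits a linear model to alternatives whose true mean performance is not linear in a single feature. It is not the paper's model or its prospective sampling principle. The feature values, true means, allocation vectors, and the helper `fit_linear_limit` are all hypothetical, chosen only to show a general fact about least squares: when the linear model is wrong, the large-sample fit depends on how sampling effort is spread across the alternatives.

```python
import numpy as np

# Hypothetical setup: four alternatives described by one scalar feature.
# The true mean performance is NOT linear in the feature (illustrative numbers).
features = np.array([0.0, 1.0, 2.0, 3.0])
true_means = np.array([1.0, 0.2, 0.4, 0.8])  # alternative 0 is truly the best


def fit_linear_limit(features, true_means, weights):
    """Large-sample least-squares fit of mean = a + b * feature.

    `weights` are the long-run sampling fractions per alternative. With
    noise averaged out, the fit solves a weighted least-squares problem
    whose answer depends on those fractions when the model is misspecified.
    """
    X = np.column_stack([np.ones_like(features), features])
    W = np.diag(weights)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ true_means)


# Allocation 1: sample every alternative equally often.
uniform = np.full(4, 0.25)
# Allocation 2: sampling effort drifts away from alternative 0.
skewed = np.array([0.01, 0.33, 0.33, 0.33])

intercept_u, slope_u = fit_linear_limit(features, true_means, uniform)
intercept_s, slope_s = fit_linear_limit(features, true_means, skewed)

# Alternative that each fitted linear model ranks highest.
best_uniform = int(np.argmax(intercept_u + slope_u * features))
best_skewed = int(np.argmax(intercept_s + slope_s * features))

print("uniform allocation: slope", slope_u, "predicted best", best_uniform)
print("skewed allocation:  slope", slope_s, "predicted best", best_skewed)
```

In this constructed example the two allocations produce fitted slopes of opposite sign, so the fitted model points to different alternatives as best. When a sampling policy chooses where to sample based on such fits, the allocation and the estimate feed back into each other, which is the kind of dependence the abstract attributes to sample-selection endogeneity.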