Cost-Restricted Feature Selection for Data Acquisition
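Studies how to select features under a budget constraint so as to minimize prediction error when different features have different acquisition costs in consumer data acquisition; the approach applies to linear regression and logistic regression, and the paper provides solution methods and experimental validation.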
When acquiring consumer data for marketing or new business initiatives, it is important to decide which attributes, or features, of potential customers should be acquired. We study a new feature selection problem in the context of customer data acquisition in which different features have different acquisition costs. We study this problem for both linear regression and logistic regression. We formulate the feature selection and acquisition problems as nonlinear discrete optimization problems that minimize prediction error subject to a budget constraint. We derive analytical properties of the solutions to these problems, develop a computational procedure for solving them, provide an intuitive interpretation of the feature selection criteria, and discuss the managerial implications of the solution approach. The results of an experimental study demonstrate the effectiveness of our approach. This paper was accepted by Kartik Hosanagar, information systems. Supplemental Material: Data are available at https://doi.org/10.1287/mnsc.2022.4551.
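
To make the budget-constrained formulation in the abstract concrete, the following is a minimal sketch for the linear regression case. It is not the paper's computational procedure: it simply enumerates every feature subset whose total acquisition cost fits the budget and keeps the one with the lowest validation mean squared error, which is only practical for a handful of features. The function name `select_features_under_budget`, the synthetic data, the cost vector, and the budget values are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def select_features_under_budget(X_train, y_train, X_val, y_val, costs, budget):
    """Brute-force search (illustrative only): among all feature subsets whose
    total cost is within the budget, return the one with the lowest
    validation MSE under ordinary least squares with an intercept."""
    n_features = X_train.shape[1]
    # Baseline: acquire nothing and predict the training mean.
    best_subset = ()
    best_mse = float(np.mean((y_val - y_train.mean()) ** 2))
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            if sum(costs[j] for j in subset) > budget:
                continue  # violates the budget constraint
            cols = list(subset)
            A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
            A_val = np.column_stack([np.ones(len(X_val)), X_val[:, cols]])
            coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
            mse = float(np.mean((y_val - A_val @ coef) ** 2))
            if mse < best_mse:
                best_subset, best_mse = subset, mse
    return best_subset, best_mse


# Synthetic example (assumed data): feature 0 is the most predictive
# but also the most expensive; features 2 and 3 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=400)
X_train, X_val = X[:200], X[200:]
y_train, y_val = y[:200], y[200:]
costs = [5.0, 1.0, 1.0, 1.0]

tight_subset, tight_mse = select_features_under_budget(
    X_train, y_train, X_val, y_val, costs, budget=3.0
)
loose_subset, loose_mse = select_features_under_budget(
    X_train, y_train, X_val, y_val, costs, budget=6.0
)
```

With the tight budget the expensive feature cannot be acquired, so the search has to settle for cheaper, less informative features; raising the budget lets it buy the expensive feature and lowers the validation error. Enumeration grows exponentially with the number of features, which is one reason a dedicated solution procedure is needed for realistic problem sizes.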