Cost-Restricted Feature Selection for Data Acquisition

Management Science · 2022
Cited by 8
Journal rankings: Renmin University of China (RUC) A+ · FT50 · UTD24 · ABS 4*

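Summary

This paper studies consumer data acquisition in which different features carry different acquisition costs, and asks how to select features under a budget constraint so as to minimize prediction error. The approach applies to linear regression and logistic regression, and the paper provides a solution method along with experimental validation.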

Abstract

When acquiring consumer data for marketing or new business initiatives, it is important to decide what attributes or features of potential customers should be acquired. We study a new feature selection problem in the context of customer data acquisition in which different features have different acquisition costs. This feature selection problem is studied for linear regression and logistic regression. We formulate the feature selection and acquisition problems as nonlinear discrete optimization problems that minimize prediction errors subject to a budget constraint. We derive the analytical properties of the solutions for the problems, develop a computational procedure for solving the problems, provide an intuitive interpretation for the feature selection criteria, and discuss managerial implications of the solution approach. The results of the experimental study demonstrate the effectiveness of our approach.

This paper was accepted by Kartik Hosanagar, information systems.

Supplemental Material: Data are available at https://doi.org/10.1287/mnsc.2022.4551.
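To make the problem statement concrete, the following is a minimal sketch of budget-constrained feature selection for linear regression. It is not the paper's computational procedure, which the abstract does not detail. As an assumption for illustration, it enumerates every feature subset whose total acquisition cost fits the budget and scores each by held-out mean squared error. The function names (`holdout_mse`, `select_features`), the synthetic data, the costs, and the budget are all hypothetical.

```python
import itertools
import numpy as np


def holdout_mse(X_train, y_train, X_test, y_test, subset):
    """Fit ordinary least squares on the chosen columns; return held-out MSE."""
    cols = np.asarray(subset, dtype=int)
    # Intercept column plus the selected features (an empty subset fits the mean only).
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residuals = y_test - A_test @ coef
    return float(np.mean(residuals ** 2))


def select_features(X_train, y_train, X_test, y_test, costs, budget):
    """Brute-force search over subsets whose total cost does not exceed the budget.

    Illustrative only: enumeration is exponential in the number of features.
    """
    n_features = X_train.shape[1]
    best_subset = ()
    best_err = holdout_mse(X_train, y_train, X_test, y_test, best_subset)
    for k in range(1, n_features + 1):
        for subset in itertools.combinations(range(n_features), k):
            if sum(costs[j] for j in subset) > budget:
                continue  # infeasible: violates the budget constraint
            err = holdout_mse(X_train, y_train, X_test, y_test, subset)
            if err < best_err:
                best_subset, best_err = subset, err
    return best_subset, best_err


# Hypothetical example: feature 0 is the most predictive but too costly to acquire.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=400)
costs = [5.0, 2.0, 1.0, 1.0]
budget = 3.0

subset, err = select_features(X[:200], y[:200], X[200:], y[200:], costs, budget)
print("selected features:", subset, "held-out MSE:", round(err, 3))
```

Exhaustive search is practical only for a handful of features. It is used here because it makes the objective and the budget constraint explicit, not because it scales.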

Keywords: cost restriction · feature selection · data acquisition · linear regression · logistic regression