Active Feature-Value Acquisition

Management Science · 2009
Cited by 111
Journal rankings: Renmin University of China A+, FT50, UTD24, ABS 4*

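Summary

The paper proposes a framework for active feature-value acquisition based on estimating information value. Its sampled expected utility policy ranks potential acquisitions effectively when little information is available. Experiments show that, compared with representative sampling, it lowers the cost of modeling while performing consistently.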

Abstract

Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring the values of features may be costly, and simply acquiring all values may be wasteful or prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although straightforward in principle, estimations and approximations must be made to apply the framework in practice. We present an acquisition policy, sampled expected utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with the policy of using representative sampling for feature acquisition, SEU reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains. We also extend the framework to a more general modeling setting in which feature values as well as class labels are missing and are costly to acquire.
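The abstract does not give the estimators that SEU uses, so the following is only a minimal sketch of the general idea it describes: score a random sample of candidate feature-value queries by estimated improvement per unit cost, then rank them. All names here (`Query`, `expected_utility`, `rank_sampled_queries`, and the `value_probs`, `gain`, and `cost` callables) are hypothetical placeholders supplied by the caller, not the paper's notation or implementation.

```python
import random
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical representation: (instance index, feature index) of a missing value.
Query = Tuple[int, int]


def expected_utility(
    query: Query,
    value_probs: Callable[[Query], Dict[Hashable, float]],
    gain: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
) -> float:
    """Estimated model improvement per unit cost of acquiring one value.

    value_probs(query) -> assumed distribution over the values the feature may take
    gain(query, value) -> assumed estimate of model improvement if that value is seen
    cost(query)        -> assumed positive acquisition cost
    """
    expected_gain = sum(p * gain(query, v) for v, p in value_probs(query).items())
    return expected_gain / cost(query)


def rank_sampled_queries(
    candidates: Sequence[Query],
    sample_size: int,
    value_probs: Callable[[Query], Dict[Hashable, float]],
    gain: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
    rng: random.Random,
) -> List[Tuple[Query, float]]:
    """Score a uniform random sample of candidates and sort by expected utility."""
    k = min(sample_size, len(candidates))
    sampled = rng.sample(list(candidates), k)  # avoids scoring every missing value
    scored = [(q, expected_utility(q, value_probs, gain, cost)) for q in sampled]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

In an acquisition loop, a caller would acquire the top-ranked value or values, retrain the model, and repeat. The uniform sampling step is an assumption made here to keep the sketch short; other sampling schemes would fit the same interface.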

Keywords: active feature-value acquisition; information value estimation; sampled expected utility; cost-effective modeling