🌙

最佳子集选择的混合整数分数优化方法

A Mixed-Integer Fractional Optimization Approach to Best Subset Selection

INFORMS journal on computing · 2021
被引 20
人大 BUTD24ABS 3

中文导读

研究了线性回归中的最佳子集选择问题,提出用分数混合整数优化建模,利用现代求解器求解,计算实验表明比现有方法快几个数量级。

Abstract

We consider the best subset selection problem in linear regression—that is, finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We are primarily concerned with alternatives to cross-validation methods that do not require data partitioning and involve a range of information criteria extensively studied in the statistical literature. We show that the problem of interest can be modeled using fractional mixed-integer optimization, which can be tackled by leveraging recent advances in modern optimization solvers. The proposed algorithms involve solving a sequence of mixed-integer quadratic optimization problems (or their convexifications) and can be implemented with off-the-shelf solvers. We report encouraging results in our computational experiments, with respect to both the optimization and statistical performance. Summary of Contribution: This paper considers feature selection problems with information criteria. We show that by adopting a fractional optimization perspective (a well-known field in nonlinear optimization and operations research), it is possible to leverage recent advances in mixed-integer quadratic optimization technology to tackle traditional statistical problems long considered intractable. We present extensive computational experiments, with both synthetic and real data, illustrating that the new fractional optimization approach is orders of magnitude faster than existing approaches in the literature.

线性回归特征选择混合整数优化信息准则