🌙

回归中的异常值检测:锥二次规划形式

Outlier Detection in Regression: Conic Quadratic Formulations

INFORMS journal on computing · 2025
被引 0
人大 BUTD24ABS 3

中文导读

针对线性回归中异常值检测问题,提出不使用大M约束的更强二阶锥松弛,计算实验表明比现有方法快几个数量级,且精确求解能获得比启发式更好的解。

Abstract

In many applications, when building linear regression models, it is important to account for the presence of outliers, that is, corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxation and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders of magnitude faster than existing big-M formulations in the literature for this problem. In addition, we verify that exact solution of the mixed-integer optimization problems can lead to substantially better-quality solutions than simpler relaxations or heuristics commonly used in practice. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: This work was supported by a public grant as part of the Investissement d’avenir project [Grant ANR-11-LABX-0056-LMH, LabEx LMH]. A. Gómez is supported by the US Air Force Office of Scientific Research [Grant FA9550-22-1-0369]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2025.1215 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2025.1215 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .

异常值检测线性回归混合整数优化二阶锥松弛计算实验