Robust Optimization and Data Classification for Characterization of Huntington Disease Onset via Duality Methods
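To address possible inaccuracies in clinical Huntington disease data, robust optimization and duality techniques are applied to improve support vector machine classifiers. Numerically solvable semidefinite program or quadratic program reformulations are given under several uncertainty models, classification accuracies of over 95% are achieved on a large HD dataset, and highly correlated features are selected.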
Abstract The features that characterize the onset of Huntington disease (HD) are poorly understood, yet they have significant implications for research and clinical practice. Motivated by the need to address this issue, and by the fact that clinical HD data may contain inaccuracies, we apply robust optimization and duality techniques to study support vector machine (SVM) classifiers in the face of uncertainty in feature data. Using conic duality, we present numerically tractable semidefinite program reformulations for a broad class of robust SVM classification problems under a general spectrahedron uncertainty set, which covers the uncertainty sets most commonly used in robust optimization models, such as boxes, balls, and ellipsoids. For the box-uncertainty model, we also provide a new, simple quadratic program reformulation via Lagrangian duality, which leads to a very efficient iterative scheme for robust classifiers. Computational results on a range of datasets indicate that these robust classification methods achieve higher classification accuracies than conventional SVMs and also select groups of highly correlated features. The conic duality-based robust SVMs were also successfully applied to a new, large HD dataset, achieving classification accuracies of over 95% and providing important information about the features that characterize HD onset.
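As a small illustration of why box uncertainty in the feature data admits a simple reformulation, the sketch below checks a standard fact for linear classifiers: the worst-case value of y(w·(x + d) + b) over all perturbations d with |d_j| <= r_j equals y(w·x + b) minus the r-weighted 1-norm of w. This is not the reformulation derived in the paper, only the textbook identity that robust linear SVM constraints of this kind rest on. The function names, the per-feature radius vector `r`, and the numeric values are all assumptions made for the example.

```python
import itertools
import numpy as np


def worst_case_margin(w, b, x, y, r):
    """Closed-form minimum of y * (w @ (x + d) + b) over the box |d_j| <= r_j.

    Assumes a label y in {-1, +1} and non-negative radii r (hypothetical setup).
    """
    return y * (w @ x + b) - np.abs(w) @ r


def brute_force_margin(w, b, x, y, r):
    """Same minimum found by enumerating the 2^n box vertices (small n only).

    A linear function attains its minimum over a box at one of its vertices.
    """
    signs = itertools.product([-1.0, 1.0], repeat=len(x))
    return min(y * (w @ (x + np.array(s) * r) + b) for s in signs)


# Illustrative values, not taken from any dataset.
w = np.array([1.0, -2.0, 0.5])
b = 0.25
x = np.array([0.5, -1.0, 2.0])
y = 1.0
r = np.array([0.1, 0.2, 0.3])

closed = worst_case_margin(w, b, x, y, r)
brute = brute_force_margin(w, b, x, y, r)
print(closed, brute)
```

Because the worst case has this closed form, a robust margin constraint that must hold for every point in the box collapses to a single constraint involving the weighted 1-norm of w, which is one reason box models are often the cheapest uncertainty sets to handle numerically.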