Fast Association Recovery in High Dimensions by Parallel Learning

INFORMS Journal on Computing · 2025
Cited by: 0
Journal rankings: Renmin University list B · UTD24 · ABS 3

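Summary

To address the slow computation of sparse reduced-rank regression on large-scale, high-dimensional data, the paper proposes a parallel algorithm that decomposes the problem into multiple subproblems solved in parallel. The approach substantially speeds up computation while preserving statistical consistency, and its effectiveness is validated through an application in genetics.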

Abstract

Sparse reduced-rank regression is a widely used tool for revealing the association between multiple responses and predictors, and it has been applied in many data-driven settings. Although much of the literature has studied its theoretical properties and numerical algorithms, the computational burden for large-scale data sets remains a great challenge in practice because the problem is highly nonconvex. The gap between statistical consistency and algorithmic convergence also needs more research. To address these two issues, we formulate sparse reduced-rank regression as a set of parallel cosparse unit-rank estimation problems and propose a new algorithm that solves these subproblems in parallel. Under mild conditions, the iteration complexity of the proposed algorithm is polynomial with high-dimensional responses and predictors. We establish statistical consistency of the numerical solution, thereby bridging the gap between statistical consistency and numerical computation in nonconvex optimization. Moreover, the main calculation of the algorithm is restricted to a small active set, so it remains fast even in high dimensions. Extensive numerical studies and an application in genetics demonstrate the effectiveness and scalability of our approach.

History: Accepted by Antonio Frangioni, Area Editor for Design & Analysis of Algorithms–Continuous.

Funding: This work was supported by the National Key R&D Program of China [Grant 2024YFA1012200], the National Natural Science Foundation of China [Grants 12171449 and 72401266], the Fundamental Research Funds for the Central Universities [Grant WK2040000079], the USTC Research Funds of the Double First-Class Initiative [Grant YD2040002019], and the China Postdoctoral Science Foundation [Grant 2023M733402].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0691) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2024.0691). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
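
The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm (their software is in the repository linked above) and rests on several assumptions: the coefficient matrix is written as a sum of rank-one layers u_k v_k^T, a least-squares fit followed by a truncated SVD supplies the starting layers (which requires more samples than predictors), and each layer is refit by a generic alternating proximal update with an l1 penalty on both factors. The names `sparse_unit_rank` and `parallel_sparse_rrr` and the parameter `lam` are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def soft_threshold(z, lam):
    """Elementwise soft-thresholding, the proximal map of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_unit_rank(X, Yk, u0, v0, lam, n_iter=200):
    """Approximately minimize 0.5*||Yk - X u v^T||_F^2 + lam*(||u||_1 + ||v||_1).

    Generic alternating scheme chosen for illustration (an assumption,
    not the update rule from the paper).
    """
    u, v = u0.copy(), v0.copy()
    lip_X = np.linalg.norm(X, 2) ** 2  # largest eigenvalue of X^T X
    for _ in range(n_iter):
        vv = v @ v
        if vv == 0.0:
            break
        # One proximal-gradient step in u with v held fixed.
        step = 1.0 / (lip_X * vv)
        grad_u = X.T @ (X @ u * vv - Yk @ v)
        u = soft_threshold(u - step * grad_u, step * lam)
        z = X @ u
        zz = z @ z
        if zz == 0.0:
            break
        # Exact minimizer in v with u held fixed.
        v = soft_threshold(Yk.T @ z, lam) / zz
    return np.outer(u, v)


def parallel_sparse_rrr(X, Y, rank, lam, max_workers=None):
    """Hypothetical driver: split an initial estimate into layers, refit each in parallel."""
    # Assumption: n > p, so least squares gives a usable starting point.
    C_ls = np.linalg.lstsq(X, Y, rcond=None)[0]
    U, s, Vt = np.linalg.svd(C_ls, full_matrices=False)
    C0 = (U[:, :rank] * s[:rank]) @ Vt[:rank]

    def fit_layer(k):
        u0, v0 = U[:, k] * s[k], Vt[k]
        # Subtract the other layers' initial fit so layer k sees its own signal.
        Yk = Y - X @ (C0 - np.outer(u0, v0))
        return sparse_unit_rank(X, Yk, u0, v0, lam)

    # The layer problems share no state, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        layers = list(pool.map(fit_layer, range(rank)))
    return sum(layers)
```

Calling `parallel_sparse_rrr(X, Y, rank=2, lam=2.0)` on an n-by-p predictor matrix and an n-by-q response matrix returns a p-by-q coefficient estimate. Because each layer problem depends only on the shared initial estimate, the thread pool could be swapped for a process pool or a cluster scheduler without changing the per-layer routine.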

Keywords: sparse reduced-rank regression · high-dimensional statistics · parallel computing · nonconvex optimization