Scaled Torus Principal Component Analysis

Journal of Computational and Graphical Statistics · 2022
Cited by 11
ABS 3

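Summary

Proposes Scaled Torus Principal Component Analysis (ST-PCA) for dimensionality reduction of toroidal data. The method maps the torus to a sphere via multidimensional scaling, then fits the data with principal nested spheres; it outperforms existing methods in applications from astronomy and molecular biology.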

Abstract

A particularly challenging context for dimensionality reduction is multivariate circular data, that is, data supported on a torus. Such data appear, for example, in the analysis of various phenomena in environmental sciences and astronomy, as well as in molecular structures. This article introduces Scaled Torus Principal Component Analysis (ST-PCA), a novel approach to dimensionality reduction for toroidal data. ST-PCA finds a data-driven map from a torus to a sphere of the same dimension and a certain radius. The map is constructed with multidimensional scaling to minimize the discrepancy between pairwise geodesic distances in both spaces. ST-PCA then resorts to principal nested spheres to obtain a nested sequence of subspheres that best fits the data, and this sequence can afterwards be mapped back to the torus. Numerical experiments illustrate how ST-PCA can achieve meaningful dimensionality reduction on low-dimensional tori, particularly for the purpose of cluster separation, while two data applications in astronomy (on a three-dimensional torus) and molecular biology (on a seven-dimensional torus) show that ST-PCA outperforms existing methods for the investigated datasets. Supplementary materials for this article are available online.
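The quantity that the multidimensional scaling step works with can be illustrated by a minimal sketch. It is not the authors' implementation. It assumes that the torus carries the flat product metric (square root of the summed squared wrapped angular gaps) and that the discrepancy is a raw sum of squared differences; the function names `torus_geodesic_distances`, `sphere_geodesic_distances`, and `stress` are hypothetical and introduced only for illustration.

```python
import numpy as np


def torus_geodesic_distances(theta):
    """Pairwise geodesic distances between rows of `theta` on a d-torus.

    theta: (n, d) array of angles in radians.
    Assumption: flat product metric on the torus.
    """
    gap = np.abs(theta[:, None, :] - theta[None, :, :])  # shape (n, n, d)
    gap = np.mod(gap, 2 * np.pi)
    gap = np.minimum(gap, 2 * np.pi - gap)  # wrap each angular gap into [0, pi]
    return np.sqrt((gap ** 2).sum(axis=-1))


def sphere_geodesic_distances(x, radius):
    """Pairwise great-circle distances between rows of `x`.

    x: (n, d + 1) array of points whose Euclidean norm equals `radius`.
    """
    cosines = np.clip(x @ x.T / radius ** 2, -1.0, 1.0)  # guard against rounding
    return radius * np.arccos(cosines)


def stress(theta, x, radius):
    """Sum of squared differences between torus and sphere distances.

    Assumption: raw (unnormalized) stress over distinct pairs; the paper's
    exact objective may differ.
    """
    d_torus = torus_geodesic_distances(theta)
    d_sphere = sphere_geodesic_distances(x, radius)
    upper = np.triu_indices(len(theta), k=1)  # count each pair once
    return float(((d_torus[upper] - d_sphere[upper]) ** 2).sum())
```

Under these assumptions, a torus-to-sphere map would be chosen by minimizing `stress` over the sphere configuration `x` and the radius; the optimization itself is left out of the sketch.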

Dimensionality reduction · Circular data · Principal component analysis · Multidimensional scaling · Astronomy