Coordinatewise Gaussianization: Theories and Applications

Journal of the American Statistical Association · 2022

Cited by 11

ABS 4

Di He
Qing Mai
Hui Zou (corresponding author)

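Summary

The paper studies the theoretical properties of the normal score transformation as a method for coordinatewise Gaussianization. It proves that the transformation converges uniformly to its population counterpart in the high-dimensional setting (log p = o(n / log n)), demonstrates applications to the Gaussian copula model, the nearest shrunken centroids classifier, and distance correlation, and identifies scenarios in which the transformation fails.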

Abstract

In statistical analysis, researchers often perform coordinatewise Gaussianization such that each variable is marginally normal. The normal score transformation is a method for coordinatewise Gaussianization and is widely used in statistics, econometrics, genetics and other areas. However, few studies exist on the theoretical properties of the normal score transformation, especially in high-dimensional problems where the dimension p diverges with the sample size n. In this article, we show that the normal score transformation uniformly converges to its population counterpart even when log p = o(n / log n). Our result can justify the normal score transformation prior to any downstream statistical method to which the theoretical normal transformation is beneficial. The same results are established for the Winsorized normal transformation, another popular choice for coordinatewise Gaussianization. We demonstrate the benefits of coordinatewise Gaussianization by studying its applications to the Gaussian copula model, the nearest shrunken centroids classifier and distance correlation. The benefits are clearly shown in theory and supported by numerical studies. Moreover, we also point out scenarios where coordinatewise Gaussianization does not help and even causes damage. We offer a general recommendation on how to use coordinatewise Gaussianization in applications. Supplementary materials for this article are available online.
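
As a rough illustration of the two transformations named in the abstract, the sketch below Gaussianizes each coordinate by passing a rescaled empirical CDF through the standard normal quantile function. It is not taken from the article: the function names (normal_score, winsorized_normal_score, gaussianize) are hypothetical, the n + 1 rescaling is one common convention, and the default truncation level delta is the choice often used in the nonparanormal literature, assumed here rather than confirmed by the source. It also assumes each column has at least two observations.

```python
from bisect import bisect_right
from math import log, pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def _ecdf_counts(column):
    """For each value, count how many observations in the column are <= it."""
    ordered = sorted(column)
    return [bisect_right(ordered, x) for x in column]


def normal_score(column):
    """Normal score transform of one coordinate (hypothetical helper).

    Dividing by n + 1 instead of n keeps every value strictly inside (0, 1),
    so the normal quantile function never returns +/- infinity.
    """
    n = len(column)
    return [_STD_NORMAL.inv_cdf(c / (n + 1)) for c in _ecdf_counts(column)]


def winsorized_normal_score(column, delta=None):
    """Winsorized variant: truncate the empirical CDF to [delta, 1 - delta].

    The default delta is an assumption borrowed from the nonparanormal
    literature, not a value confirmed by the article. Requires n >= 2.
    """
    n = len(column)
    if delta is None:
        delta = 1.0 / (4.0 * n ** 0.25 * sqrt(pi * log(n)))
    scores = []
    for c in _ecdf_counts(column):
        u = min(max(c / n, delta), 1.0 - delta)  # clip away from 0 and 1
        scores.append(_STD_NORMAL.inv_cdf(u))
    return scores


def gaussianize(rows, transform=normal_score):
    """Apply a coordinatewise transform to an n-by-p table given as rows."""
    columns = [transform(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

Calling gaussianize(rows) on an n-by-p table returns a table of the same shape whose columns are approximately standard normal; passing transform=winsorized_normal_score switches to the truncated version. Both transforms depend on the data only through ranks, so they preserve the ordering within each coordinate.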

Statistics · Econometrics · Genetics · High-dimensional data analysis · Data transformation
