A Rank Correlation Coefficient Resistant to Outliers

Journal of the American Statistical Association · 1987
Cited by 22
ABS 4

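Summary

This article defines a nonparametric correlation coefficient, Rg, based on the principle of maximum deviations. Rg is easy to compute by hand and, in the presence of biased outliers, is more robust than the Pearson, Spearman, and Kendall correlation coefficients. An analysis of real data illustrates its distinctive way of measuring association.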

Abstract

In this article, a nonparametric correlation coefficient is defined that is based on the principle of maximum deviations. This new correlation coefficient, Rg, is easy to compute by hand for small to medium sample sizes. In comparing it with existing correlation coefficients, it was found to be superior in a sampling situation that we call "biased outliers," and hence appears to be more resistant to outliers than the Pearson, Spearman, and Kendall correlation coefficients. In a correlational study not included in this article of some social data consisting of five variables for each of 51 observations, Rg was compared with the other three correlation coefficients. There was agreement on 8 of the 10 possible correlations, but in one case, Rg was significant when the others were not, and in yet another case, Rg was not significant when the others were. A further analysis of this data set indicated that there were three to six data points that were anomalies and had a severe effect on the other correlations but not Rg. Apparently, the statistic Rg measures association in a unique fashion. This different measure of association for real data is extended to a population interpretation and expressed in terms of the copula function.
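
The abstract does not state the formula for Rg, so the following is only a minimal sketch under an assumed definition: the form in which the greatest deviation coefficient is usually written. Sort the pairs by x, let p_j be the y-rank of the pair with the j-th smallest x, set d_i(p) = #{j ≤ i : p_j > i}, and take Rg = (max_i d_i(n + 1 − p) − max_i d_i(p)) / ⌊n/2⌋. The function name `greatest_deviation_corr` and the no-ties restriction are illustrative choices, not taken from the article.

```python
def greatest_deviation_corr(x, y):
    """Sketch of a greatest-deviation rank correlation.

    Assumed formula (not given in the abstract):
        Rg = (max_i d_i(reversed p) - max_i d_i(p)) / floor(n / 2)
    where d_i(p) counts the first i entries of p that exceed i.
    Ties are not handled in this sketch.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no tied values")

    # Rank the y values from 1 (smallest) to n (largest).
    rank_y = [0] * n
    for rank, idx in enumerate(sorted(range(n), key=lambda i: y[i]), start=1):
        rank_y[idx] = rank

    # p[j] is the y-rank of the observation with the (j + 1)-th smallest x.
    p = [rank_y[idx] for idx in sorted(range(n), key=lambda i: x[i])]

    def max_deviation(perm):
        # Largest count, over i = 1..n, of entries among the first i that exceed i.
        return max(sum(1 for v in perm[:i] if v > i) for i in range(1, n + 1))

    reversed_p = [n + 1 - v for v in p]
    return (max_deviation(reversed_p) - max_deviation(p)) / (n // 2)
```

Under this assumed definition, perfectly concordant ranks give 1, perfectly discordant ranks give -1, and negating y flips the sign of the result. Each count uses only whether a rank exceeds a threshold, never how far an observation lies from the rest, which is consistent with the resistance to outliers that the abstract reports.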

Statistics · Nonparametric statistics · Correlation analysis · Outlier handling