The Prediction Sum of Squares as a General Measure for Regression Diagnostics
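Proposes using the Q2 statistic (based on the prediction sum of squares of jackknifed residuals) to detect outliers or influential observations in regression, demonstrates its sensitivity on cost-of-living and food-industry data, and recommends it as part of the data-quality check.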
Abstract
Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q2 statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, whose sum of squares is generally known as the prediction sum of squares, or PRESS. This article compares R2 with Q2 and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q2 statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q2 and R2 values is negligible.
KEY WORDS: Jackknife residual; Proportional reduction of error; Q2; R2
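As a rough illustration of the comparison described in the abstract, the sketch below computes R2 and a PRESS-based Q2 for an ordinary least squares fit. It rests on assumptions not stated in the abstract: Q2 is taken to be 1 - PRESS/TSS with the total sum of squares measured around the full-sample mean (the article may normalize differently), the jackknifed residuals are obtained from the standard leverage identity e_i / (1 - h_ii), and the function name `r2_and_q2` and the synthetic data are hypothetical, not the cost-of-living or food-industry data.

```python
import numpy as np


def r2_and_q2(X, y):
    """Return (R2, Q2) for an OLS fit of y on X; an intercept is added here.

    Assumption: Q2 = 1 - PRESS / TSS, with TSS taken around the full-sample mean.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y, dtype=float)

    A = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS coefficients
    resid = y - A @ beta                           # ordinary residuals

    # Leverages h_ii are the diagonal of the hat matrix; with A = QR (thin QR,
    # A of full column rank) they equal the squared row norms of Q.
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q ** 2, axis=1)

    # Jackknifed (leave-one-out) prediction residuals without refitting.
    # This breaks down if some h_ii equals 1 (a point that determines its own fit).
    loo_resid = resid / (1.0 - h)

    press = np.sum(loo_resid ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / tss
    q2 = 1.0 - press / tss
    return r2, q2


# Synthetic, purely illustrative data: a clean linear relation, then the same
# sample with one high-leverage point appended that contradicts the trend.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=30)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=30)

print("clean sample:  R2=%.3f  Q2=%.3f" % r2_and_q2(x, y))

x_out = np.append(x, 40.0)
y_out = np.append(y, 0.0)
print("with outlier:  R2=%.3f  Q2=%.3f" % r2_and_q2(x_out, y_out))
```

Because each jackknifed residual is at least as large in magnitude as the corresponding ordinary residual, Q2 under this definition never exceeds R2, and nothing bounds it below by zero, which is consistent with the negative values the abstract reports in the presence of outliers.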