The Effect of Sample Design on Principal Component Analysis

Journal of the American Statistical Association · 1986
Cited by 7
ABS 4

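Summary

Examines how the sampling design (for example, stratified sampling) affects estimation of the eigenvalues and eigenvectors of the covariance matrix in principal component analysis, and compares the bias of the conventional estimators with that of probability-weighted and maximum likelihood estimators.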

Abstract

Most sample surveys are multivariate and many lend themselves to multivariate methods of analysis. The most usual mode of such analysis is a standard statistical package, such as BMDP or SPSS, in which the multivariate analyses are based on the underlying assumption that the data are generated as independent observations from a common probability distribution. This assumption ignores the sample selection procedure involved in the survey, which leads to the following basic questions. What effects can the sample design have on methods of multivariate analysis? How should such effects be taken into account? This article considers the case of principal component analysis and, in particular, the point estimation of the eigenvalues and eigenvectors of a covariance matrix. It is assumed that the selection of the sample depends on the population values of auxiliary variables as, for example, in stratified sampling. The conventional estimators, based on the assumption of simple random sampling, are compared with alternative probability-weighted and maximum likelihood estimators. Under a multivariate normal model, simple expressions are presented for the approximate model bias of the different estimators. The validity of these results is assessed in a simulation study involving a disproportionate stratified design.
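The contrast the abstract draws between conventional and probability-weighted estimators can be illustrated with a small simulation. The sketch below is not the article's study: the population model, the stratum boundary (|z| > 1.5), the sample sizes, and the function names `pca_eigen` and `simulate` are all assumptions chosen for illustration, and the maximum likelihood estimator is not shown. It draws an equal-sized sample from a small "tail" stratum and a large "centre" stratum of an auxiliary variable z, then compares the eigenvalues of the unweighted sample covariance matrix with those of the covariance matrix weighted by inverse inclusion probabilities.

```python
import numpy as np


def pca_eigen(X, weights=None):
    """Eigenvalues/eigenvectors of a (optionally weighted) covariance matrix.

    Weights are normalised to sum to one, so the divisor is the weight total
    (n in the unweighted case, not n - 1).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    mean = w @ X                      # weighted mean vector
    Xc = X - mean
    cov = (Xc * w[:, None]).T @ Xc    # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]


def simulate(seed=0, N=50_000, n_per_stratum=500):
    """Hypothetical disproportionate stratified design on an auxiliary z."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(N)                      # auxiliary variable
    noise = 0.5 * rng.standard_normal((N, 3))
    Y = np.column_stack([z, 0.8 * z, np.zeros(N)]) + noise  # survey variables

    tail = np.abs(z) > 1.5                          # small stratum, oversampled
    idx, wts = [], []
    for mask in (tail, ~tail):
        members = np.flatnonzero(mask)
        chosen = rng.choice(members, size=n_per_stratum, replace=False)
        idx.append(chosen)
        # design weight = N_h / n_h, the inverse inclusion probability
        wts.append(np.full(n_per_stratum, members.size / n_per_stratum))
    idx, wts = np.concatenate(idx), np.concatenate(wts)

    sample = Y[idx]
    pop_vals, _ = pca_eigen(Y)             # finite-population target
    conv_vals, _ = pca_eigen(sample)       # conventional (ignores design)
    pw_vals, _ = pca_eigen(sample, wts)    # probability-weighted
    return pop_vals, conv_vals, pw_vals


if __name__ == "__main__":
    pop, conv, pw = simulate()
    print("population eigenvalues:   ", np.round(pop, 3))
    print("conventional estimates:   ", np.round(conv, 3))
    print("prob.-weighted estimates: ", np.round(pw, 3))
```

Because the tail stratum is heavily oversampled in this assumed design, the unweighted sample overstates the variance along the direction driven by z, so the conventional leading eigenvalue should come out well above the population value, while the weighted estimate should sit close to it.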

Keywords: sample surveys, multivariate statistical analysis, principal component analysis, econometrics