Rank tests for PCA under weak identifiability
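We study rank tests for hypotheses on the leading eigenvector of the shape matrix of an elliptical distribution under weak identifiability, prove that their asymptotic properties are superior to those of parametric methods, and validate the results through Monte Carlo simulations.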
In a triangular array framework where n observations are randomly sampled from a p-dimensional elliptical distribution with shape matrix Vn, we consider the problem of testing the null hypothesis H0:θ=θ0 against the alternative hypothesis H1:θ≠θ0, where θ is the (fixed) leading unit eigenvector of Vn and θ0 is a given unit p-vector. The dependence of the shape matrix on the sample size allows us to consider challenging asymptotic scenarios in which the parameter of interest θ is unidentified in the limit, because the ratio between the two leading eigenvalues of Vn converges to one. We carefully study the corresponding limiting experiments under such weak identifiability, and we show that these may be LAN or non-LAN. While earlier work in this framework was strictly limited to Gaussian distributions, where the study of local log-likelihood ratios could simply rely on explicit expressions, our asymptotic investigation allows for essentially arbitrary elliptical distributions. This requires original results on quadratic mean differentiable families for triangular arrays of observations, which are likely to be of interest in other models, too. Even in non-LAN experiments, our results enable us to investigate, through Le Cam’s first and third lemmas, the asymptotic null and nonnull properties of multivariate rank tests. These nonparametric tests are shown to behave excellently under weak identifiability: not only do they maintain the target nominal size irrespective of the amount of weak identifiability, but they also keep their outstanding uniform efficiency properties under such nonstandard scenarios. In particular, Gaussian-score rank tests, under arbitrarily weak identifiability, still uniformly dominate their parametric pseudo-Gaussian competitor in terms of asymptotic relative efficiencies.
Our theoretical results, which are the first ones to study rank tests in the triangular array framework allowing for weak identifiability, are supported by several Monte Carlo exercises.
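
To make the notion of weak identifiability concrete, the following is a minimal simulation sketch, not the procedure studied in this work. It assumes a hypothetical single-spiked shape matrix V = I_p + r θθ' (the abstract does not specify the form of Vn) and Gaussian observations as one particular elliptical case. The function names `shape_matrix`, `leading_eigen_ratio` and `alignment` are illustrative only. As r shrinks, the ratio of the two leading eigenvalues approaches one and the leading eigenvector of the sample covariance matrix becomes a poor estimate of θ.

```python
import numpy as np


def shape_matrix(theta, r):
    """Illustrative (assumed) spiked matrix V = I_p + r * theta theta^T.

    Its leading eigenvector is theta (for r > 0) and the ratio of its
    two leading eigenvalues is 1 + r. Scale normalization is ignored,
    since it does not affect eigenvectors.
    """
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    return np.eye(theta.size) + r * np.outer(theta, theta)


def leading_eigen_ratio(V):
    """Ratio between the largest and second-largest eigenvalues of V."""
    eigvals = np.linalg.eigvalsh(V)  # returned in ascending order
    return eigvals[-1] / eigvals[-2]


def alignment(n, theta, r, rng):
    """|theta_hat' theta| for the leading eigenvector of the sample covariance.

    Observations are drawn from N(0, V), a Gaussian special case of the
    elliptical family (an assumption made here for simplicity).
    """
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    V = shape_matrix(theta, r)
    X = rng.multivariate_normal(np.zeros(theta.size), V, size=n)
    S = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(S)
    theta_hat = eigvecs[:, -1]  # eigenvector of the largest eigenvalue
    return abs(theta_hat @ theta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([1.0, 0.0, 0.0])
    n = 2000
    # Well-separated case, then spikes shrinking toward zero.
    for r in (1.0, n ** -0.25, n ** -0.5, 0.0):
        ratio = leading_eigen_ratio(shape_matrix(theta, r))
        print(f"r={r:.4f}  eigenvalue ratio={ratio:.4f}  "
              f"alignment={alignment(n, theta, r, rng):.4f}")
```

An alignment close to one indicates that θ is well estimated; values far from one for small r illustrate why tests on θ need to remain valid regardless of how close the two leading eigenvalues are.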