Use of the Bootstrap and Cross-Validation in Ridge Regression
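Summary: Several existing methods for choosing the ridge parameter in ridge regression are reviewed, and a bootstrap method is proposed. Monte Carlo simulations compare the bootstrap with competing methods under different degrees of collinearity and signal-to-noise ratios, and two data sets are used to show that the bootstrap can reduce the mean squared error of prediction while being more objective, easier to implement, and robust.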
Abstract: Several existing methods for the choice of the ridge parameter are reviewed, and a bootstrap method is proposed. The bootstrap provides independent measures of prediction errors based on multiple predictions along with an estimate of the standard error of prediction. The bootstrap and selected competitors are compared through Monte Carlo simulations for various degrees of design matrix collinearity and varying levels of signal-to-noise ratio. The procedure is also illustrated by application to two published data sets. In one case, the bootstrap choice of the ridge parameter leads to a smaller mean squared error of prediction than the ridge trace method. In the second case, an optimal choice of no perturbation is confirmed. Benefits of the bootstrap choice include its less subjective nature, ease of implementation, and robustness.
KEY WORDS: Average mean squared error of prediction; Biased estimates; Collinear data; Condition number; Ridge parameter; Robust inference; Signal-to-noise ratio
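To make the idea of a bootstrap choice of the ridge parameter concrete, the following is a minimal Python sketch. The abstract does not specify the resampling scheme, so this assumes a pairs (case) bootstrap in which each resample is used to fit the ridge estimator and the observations left out of that resample are used to measure prediction error; the paper's actual procedure may differ. The function names `ridge_fit` and `bootstrap_ridge_choice`, the grid of candidate values, and the synthetic collinear data are all hypothetical and serve only as illustration.

```python
import numpy as np


def ridge_fit(X, y, k):
    """Ridge coefficients (X'X + kI)^{-1} X'y for centered X and y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


def bootstrap_ridge_choice(X, y, k_grid, n_boot=200, seed=0):
    """Pick k from k_grid by minimizing bootstrap out-of-sample MSE.

    Assumption: pairs bootstrap, with prediction error measured on the
    rows not drawn into each resample (not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    errors = np.full((n_boot, len(k_grid)), np.nan)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample rows with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # rows left out of this resample
        if oob.size == 0:
            continue                               # nothing to predict; skip
        x_mean, y_mean = X[idx].mean(axis=0), y[idx].mean()
        Xb, yb = X[idx] - x_mean, y[idx] - y_mean  # center on the resample
        for j, k in enumerate(k_grid):
            beta = ridge_fit(Xb, yb, k)
            pred = (X[oob] - x_mean) @ beta + y_mean
            errors[b, j] = np.mean((y[oob] - pred) ** 2)
    n_valid = int(np.sum(~np.isnan(errors[:, 0])))
    mspe = np.nanmean(errors, axis=0)              # average prediction MSE per k
    se = np.nanstd(errors, axis=0, ddof=1) / np.sqrt(n_valid)
    best = int(np.argmin(mspe))
    return k_grid[best], mspe, se


# Hypothetical collinear design: x2 is nearly a copy of x1.
rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

k_grid = np.array([0.0, 0.01, 0.1, 0.5, 1.0, 5.0])  # k = 0 means no perturbation
k_best, mspe, se = bootstrap_ridge_choice(X, y, k_grid)
print("chosen k:", k_best)
print("bootstrap MSE of prediction:", np.round(mspe, 4))
print("standard errors:", np.round(se, 4))
```

Including `k = 0` in the grid lets the procedure return ordinary least squares when no perturbation is preferable, which mirrors the second data set described in the abstract. The per-candidate standard errors give a rough sense of whether differences between neighboring values of `k` are meaningful.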