On data-driven choice of λ in nonparametric Gaussian regression via Propagation–Separation approach
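A cross-validation-based, data-driven method is proposed for selecting the adaptation bandwidth λ of the Propagation–Separation approach in nonparametric Gaussian regression. It addresses the lack of an automatic way to choose this parameter, and simulations show that the method identifies the regression function effectively and yields robust estimates.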
A new procedure is proposed for selecting the input value of the adaptation bandwidth λ in nonparametric Gaussian regression via the Propagation–Separation approach. Because λ governs the bias–variance trade-off of this estimation technique, it is its key tuning parameter. So far, however, apart from theoretical concepts, no approach exists for a data-driven choice of its input value. This complicates the practical application of the Propagation–Separation method despite its highly desirable statistical properties: it is dimension-free and very well suited to estimation problems in which the underlying regression function has sharp discontinuities and large homogeneous regions. The proposed selection method is based on cross-validation and therefore allows λ to be chosen in a data-driven way. Its performance is evaluated via simulations. The results are convincing: cross-validation is a transparent selection procedure that provides a “tailor-made” input value of λ for each sample and allows successful identification of the underlying regression function. As the sample size increases, the estimates approach the true regression function. In addition, cross-validation accounts for the weighting functions and other parameter values used within the Propagation–Separation method by adjusting λ accordingly, which results in very robust estimates.
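The general idea of choosing a tuning parameter by cross-validation can be illustrated with a minimal sketch. It is not the procedure of the paper, whose exact criterion is not stated in the abstract: the helper names `kfold_cv_select` and `kernel_smoother`, the candidate grid, and the simulated step function are all assumptions made for illustration. A plain Gaussian kernel smoother stands in for the Propagation–Separation estimator, and its bandwidth plays the role that λ would play.

```python
import numpy as np

def kfold_cv_select(fit_predict, x, y, candidates, n_folds=5, seed=0):
    """Return (best_candidate, scores), minimizing held-out mean squared error.

    fit_predict(x_train, y_train, x_new, param) is a hypothetical callable;
    a Propagation-Separation estimator taking lambda could be plugged in here.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for param in candidates:
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[test_idx] = False          # hold out the current fold
            pred = fit_predict(x[train], y[train], x[test_idx], param)
            sq_err += np.sum((y[test_idx] - pred) ** 2)
        scores.append(sq_err / len(y))       # average over all held-out points
    scores = np.array(scores)
    return candidates[int(np.argmin(scores))], scores

def kernel_smoother(x_train, y_train, x_new, bandwidth):
    """Stand-in estimator (Nadaraya-Watson, Gaussian kernel).

    This is NOT the Propagation-Separation method; it only makes the sketch runnable.
    """
    u = (x_new[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Simulated sample: a step function with one sharp discontinuity plus Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
y = np.where(x < 0.5, 0.0, 2.0) + rng.normal(0.0, 0.3, size=x.size)

grid = [0.01, 0.02, 0.05, 0.2, 1.0]          # assumed candidate values
best, scores = kfold_cv_select(kernel_smoother, x, y, grid)
print("selected parameter:", best)
print("cross-validation scores:", scores)
```

Because the criterion is computed from the sample itself, each sample receives its own selected value, which is the sense in which such a choice is "tailor-made". Swapping in a different estimator or weighting scheme changes the scores, and hence the selected value, without changing the selection routine.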