Distributed Nonparametric Regression with Heterogeneity Through Prediction-Based Aggregation
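Proposes a data-driven weighted aggregation method that uses the squared prediction error matrix to achieve communication-efficient distributed nonparametric regression in heterogeneous environments, validated through simulations and heart rate data.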
Distributed statistical modeling is a powerful tool for handling large-scale datasets while maintaining data privacy. In this study, we propose a data-driven weighted aggregation procedure that leverages model prediction performance and adapts to heterogeneous distributed environments. The procedure uses the squared prediction error matrix as the main transmitted quantity; its size is the square of the number of workers, which keeps communication efficient. We show that the weights of the proposed estimates are asymptotically optimal in terms of quadratic loss and the corresponding risk, and we derive the limits of the data-driven weights. We also study the minimax property of the proposed nonparametric function estimates. To examine the finite-sample performance of the procedure, we conduct Monte Carlo simulation studies. Finally, we illustrate the methodology with an empirical analysis of a real-world dataset on heart rate prediction.
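To make the idea of prediction-based aggregation concrete, the following is a minimal single-process sketch, not the procedure analyzed in this work. It rests on several assumptions that do not come from the abstract: each worker fits a Nadaraya-Watson estimator, entry (k, j) of the squared prediction error matrix is taken to be the mean squared error of worker j's fit on worker k's held-out data, and the weights are obtained by normalizing inverse errors, a simple stand-in for the optimal weights under quadratic loss. All function names (nw_fit, spe_matrix, inverse_error_weights, make_worker_data, aggregate) are hypothetical, and the sketch does not model how fitted estimators are exchanged between workers.

```python
import math
import random


def nw_fit(xs, ys, bandwidth):
    """Return a Nadaraya-Watson predictor (Gaussian kernel) fitted on (xs, ys)."""
    def predict(x0):
        kern = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
        total = sum(kern)
        if total == 0.0:  # x0 is far from every training point
            return sum(ys) / len(ys)
        return sum(k * y for k, y in zip(kern, ys)) / total
    return predict


def spe_matrix(models, datasets):
    """S[k][j]: mean squared prediction error of model j on worker k's data.

    Assumed definition for illustration; the result is K x K for K workers.
    """
    matrix = []
    for xs, ys in datasets:
        row = [sum((y - m(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
               for m in models]
        matrix.append(row)
    return matrix


def inverse_error_weights(row):
    """Illustrative weight rule (an assumption): normalize 1 / SPE to sum to 1."""
    inv = [1.0 / max(s, 1e-12) for s in row]
    total = sum(inv)
    return [v / total for v in inv]


def make_worker_data(rng, n, shift):
    """Simulated worker data: y = sin(2*pi*x) + shift + Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + shift + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


rng = random.Random(0)
shifts = [0.0, 0.0, 2.0]  # the third worker follows a shifted regression function
train = [make_worker_data(rng, 200, s) for s in shifts]
valid = [make_worker_data(rng, 100, s) for s in shifts]  # held-out data per worker

models = [nw_fit(xs, ys, bandwidth=0.05) for xs, ys in train]
S = spe_matrix(models, valid)               # 3 x 3 matrix of prediction errors
W = [inverse_error_weights(row) for row in S]  # one weight vector per worker


def aggregate(k, x0):
    """Worker k's aggregated prediction at x0, using its own weight row."""
    return sum(w * m(x0) for w, m in zip(W[k], models))
```

In this simulated setup the third worker's regression function is shifted, so its fit predicts poorly on the first two workers' held-out data and receives a small weight in their rows of W, while it keeps a large weight in its own row. Per-worker weight rows are one way a heterogeneous environment might be accommodated; the actual weighting scheme and its optimality properties are those established in the paper, not this inverse-error rule.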