Model Averaging for Prediction With Fragmentary Data

Journal of Business & Economic Statistics · 2017

Cited by 38

Renmin University of China AABS 4

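Fang Fang · East China Normal University
Wei Lan · Southwestern University of Finance and Economics
Jingjing Tong · East China Normal University
Jun Shao · University of Wisconsin-Madison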

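Summary

For fragmentary data drawn from multiple sources, in which many sampled subjects are missing some covariates, the paper proposes a prediction method based on frequentist model averaging. The weights are selected by delete-one (leave-one-out) cross-validation, and the method is validated on personal income data from a wealth management community in China.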

Abstract

One main challenge for statistical prediction with data from multiple sources is that not all the associated covariate data are available for many sampled subjects. Consequently, we need new statistical methodology to handle this type of “fragmentary data” that has become more and more popular in recent years. In this article, we propose a novel method based on the frequentist model averaging that fits some candidate models using all available covariate data. The weights in model averaging are selected by delete-one cross-validation based on the data from complete cases. The optimality of the selected weights is rigorously proved under some conditions. The finite sample performance of the proposed method is confirmed by simulation studies. An example for personal income prediction based on real data from a leading e-community of wealth management in China is also presented for illustration.
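The idea in the abstract can be illustrated with a small sketch: fit each candidate model on every subject whose required covariates are observed, then pick averaging weights by delete-one cross-validation on the complete cases. This is only an illustration under several assumptions that do not come from the paper: linear candidate models, a nested missingness pattern, simulated data, and a simple Frank-Wolfe routine for the simplex-constrained least-squares problem. All function names (`solve`, `fit_ols`, `usable`, `loo_predictions`, `simplex_weights`) are hypothetical.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(rows, cols):
    """Least-squares fit of y on an intercept plus the covariates in cols."""
    X = [[1.0] + [r["x"][j] for j in cols] for r in rows]
    y = [r["y"] for r in rows]
    p = len(cols) + 1
    XtX = [[sum(x[a] * x[b] for x in X) for b in range(p)] for a in range(p)]
    Xty = [sum(x[a] * yi for x, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)

def predict(beta, row, cols):
    return beta[0] + sum(b * row["x"][j] for b, j in zip(beta[1:], cols))

def usable(rows, cols):
    """All subjects for which every covariate in cols is observed."""
    return [r for r in rows if all(r["x"][j] is not None for j in cols)]

def loo_predictions(rows, complete, models):
    """P[i][k]: prediction for complete case i from model k refit without that case."""
    P = []
    for held_out in complete:
        rest = [r for r in rows if r is not held_out]
        P.append([predict(fit_ols(usable(rest, m), m), held_out, m) for m in models])
    return P

def simplex_weights(P, y, iters=500):
    """Minimize ||P w - y||^2 over the probability simplex (Frank-Wolfe, exact line search)."""
    K = len(P[0])
    w = [1.0 / K] * K
    for _ in range(iters):
        resid = [sum(wk * pk for wk, pk in zip(w, p)) - yi for p, yi in zip(P, y)]
        grad = [sum(r * p[k] for r, p in zip(resid, P)) for k in range(K)]
        j = min(range(K), key=grad.__getitem__)      # best vertex of the simplex
        d = [p[j] - (r + yi) for p, r, yi in zip(P, resid, y)]  # P (e_j - w)
        denom = sum(v * v for v in d)
        if denom == 0:
            break
        step = min(1.0, max(0.0, -sum(r * v for r, v in zip(resid, d)) / denom))
        if step == 0:
            break
        w = [(1 - step) * wk for wk in w]
        w[j] += step
    return w

# Simulated fragmentary data (an assumption for illustration only):
# x[0] is always observed; x[1] and x[2] are missing for some subjects.
random.seed(0)
rows = []
for i in range(150):
    x = [random.gauss(0, 1) for _ in range(3)]
    y = 1.0 + x[0] + 0.5 * x[1] + 0.5 * x[2] + random.gauss(0, 1)
    pattern = i % 3          # 0: complete, 1: x[2] missing, 2: x[1] and x[2] missing
    if pattern >= 1:
        x[2] = None
    if pattern == 2:
        x[1] = None
    rows.append({"x": x, "y": y})

models = [(0,), (0, 1), (0, 1, 2)]            # candidate covariate sets
complete = usable(rows, (0, 1, 2))            # complete cases
P = loo_predictions(rows, complete, models)
weights = simplex_weights(P, [r["y"] for r in complete])
betas = [fit_ols(usable(rows, m), m) for m in models]

def average_prediction(row):
    """Weighted average of the candidate-model predictions for a fully observed subject."""
    return sum(w * predict(b, row, m) for w, b, m in zip(weights, betas, models))
```

In this sketch the smaller candidate models borrow strength from subjects that are not complete cases, while the weights are tuned only on subjects for which every candidate model can be evaluated.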

Fragmentary data · Model averaging · Cross-validation · Prediction
