Imputing Top‐Coded Income Data in Longitudinal Surveys*
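Proposes a method that uses information from multiple periods to improve the accuracy of imputed top-coded incomes for high earners in longitudinal surveys, and introduces a nonparametric empirical Bayes imputation method that reduces the root mean squared error of imputed values by 19% to 51%, which supports research on multi-year income inequality.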
Abstract. The incomes of top earners are typically top-coded in survey data. I show that the accuracy of imputed income values for top earners in longitudinal surveys can be improved significantly by incorporating information from multiple time periods into the imputation process in a simple way. Moreover, I introduce a new nonparametric empirical Bayes imputation method that further improves imputation quality: it reduces the root mean squared error (RMSE) of imputed income values by 19–51% relative to standard approaches in the literature. I also illustrate the benefits of the empirical Bayes method for investigating multi-year income inequality.