Is Predicted Data a Viable Alternative to Real Data?

World Bank Economic Review · 2019
Cited by: 7
Journal ranking tags: Renmin University A-ABS 3

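Summary (translated from Chinese)

The study uses double sampling to replace part of the real data with predicted data, with the aim of lowering the cost of poverty and health statistics. It finds that the method yields only modest cost savings in most cases and recommends giving priority to real data.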

Abstract

It is costly to collect the household- and individual-level data that underlie official estimates of poverty and health. For this reason, developing countries often do not have the budget to update estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data by means of double sampling, where the expensive outcome variable is collected for a subsample and its predictors for all. This study finds that double sampling yields only modest reductions in financial costs when imposing a statistical precision constraint in a wide range of realistic empirical settings. There are circumstances in which the gains can be more substantial, but these denote the exception rather than the rule. The recommendation is to rely on real data whenever there is a need for new data and to use prediction estimators to leverage existing data.
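
The double sampling idea described in the abstract can be illustrated with a minimal sketch. It assumes a single predictor and the classical regression estimator for two-phase sampling, which may differ from the estimators the study actually evaluates. The function name `double_sampling_mean` and its arguments are hypothetical and chosen only for illustration.

```python
import statistics


def double_sampling_mean(x_all, y_sub, sub_idx):
    """Estimate the mean of an expensive outcome y under double sampling.

    x_all   : cheap predictor, observed for every unit in the full sample
    y_sub   : expensive outcome, observed only for the subsample
    sub_idx : positions in x_all of the subsample units (same order as y_sub)

    Illustrative assumption: one predictor and a linear working model,
    i.e. the textbook regression estimator, not necessarily the paper's.
    """
    x_sub = [x_all[i] for i in sub_idx]
    x_bar_sub = statistics.fmean(x_sub)
    y_bar_sub = statistics.fmean(y_sub)

    # Ordinary least squares slope, fitted on the subsample only.
    sxx = sum((x - x_bar_sub) ** 2 for x in x_sub)
    sxy = sum((x - x_bar_sub) * (y - y_bar_sub) for x, y in zip(x_sub, y_sub))
    slope = sxy / sxx

    # Shift the subsample mean of y by the gap between the full-sample
    # and subsample means of the predictor.
    x_bar_all = statistics.fmean(x_all)
    return y_bar_sub + slope * (x_bar_all - x_bar_sub)
```

When the predictor explains the outcome well, the adjustment term recovers much of the information lost by observing the outcome for only part of the sample. When it explains little, the estimate stays close to the plain subsample mean, which is consistent with the abstract's finding that the cost savings are typically modest.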

Keywords: double sampling; predicted data; poverty estimation; health statistics