Combining survey and census data for improved poverty prediction using semi-supervised deep learning
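We propose a semi-supervised deep learning method that uses pseudo-labeling to combine large amounts of unlabeled census data with limited survey data. Across several African regions it raises poverty-prediction AUC to between 0.8 and above 0.9, outperforming traditional methods and supporting more precise identification of poor populations and better resource allocation.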
This paper presents a methodology for predicting poverty using semi-supervised learning techniques, specifically pseudo-labeling, and deep learning algorithms. Standard poverty prediction models rely on limited household survey data, whereas our approach exploits large amounts of unlabeled census data to improve prediction accuracy. By applying pseudo-labeling, we improve key performance metrics across various African regions, where our models outperform conventional approaches to identifying poor individuals. Deep neural networks (DNNs) trained on pseudo-labeled data exhibited area under the curve (AUC) scores ranging from 0.8 to over 0.9, a notable improvement over previous machine learning survey-based methods. Furthermore, random undersampling was key to refining model performance, balancing higher coverage with some reduction in precision. These findings have significant implications for poverty targeting, enabling more accurate identification of poor individuals and supporting better resource allocation.

• Semi-supervised learning techniques like pseudo-labeling leverage large amounts of unlabeled census data, outperforming traditional methods that rely on limited survey data for poverty prediction.
• Pseudo-labeling improved key metrics across various African regions, demonstrating superior performance in predicting poverty among diverse populations.
• Deep neural networks (DNNs) trained on pseudo-labeled data surpassed traditional models, achieving AUC scores ranging from 0.8 to over 0.9.
• Random undersampling and Bayesian optimization were critical for improving the DNN model’s coverage and AUC, although this came with a trade-off between higher coverage and reduced precision.
• Implications for poverty targeting include more accurate identification of poor individuals, leading to better resource allocation and more effective anti-poverty interventions.
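The pseudo-labeling workflow summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: a small logistic regression stands in for the DNN, the data are synthetic, Bayesian optimization is omitted, and the 0.9 confidence threshold and every function name (`make_data`, `fit_logistic`, `random_undersample`, `pseudo_label`, `auc`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_data(n, rng, poor_rate=0.3):
    """Synthetic households: two features, label 1 = poor (illustrative only)."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < poor_rate else 0
        center = -1.0 if label else 1.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


def fit_logistic(X, y, epochs=200, lr=0.1):
    """Batch gradient descent logistic regression (stand-in for the DNN)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def random_undersample(X, y, rng):
    """Randomly drop majority-class rows until both classes are the same size."""
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    n = min(len(pos), len(neg))
    keep = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]


def pseudo_label(X_lab, y_lab, X_unlab, threshold=0.9, rng=None):
    if rng is None:
        rng = random.Random(0)
    # Step 1: train a teacher on the balanced labeled (survey-like) rows.
    Xb, yb = random_undersample(X_lab, y_lab, rng)
    teacher = fit_logistic(Xb, yb)
    # Step 2: keep only confident predictions on the unlabeled (census-like) rows.
    X_pseudo, y_pseudo = [], []
    for xi, p in zip(X_unlab, predict_proba(teacher, X_unlab)):
        if p >= threshold or p <= 1.0 - threshold:
            X_pseudo.append(xi)
            y_pseudo.append(1 if p >= 0.5 else 0)
    # Step 3: retrain on labeled plus pseudo-labeled rows, rebalanced.
    X_all, y_all = random_undersample(X_lab + X_pseudo, y_lab + y_pseudo, rng)
    return fit_logistic(X_all, y_all), len(X_pseudo)


def auc(y_true, scores):
    """AUC as the share of (poor, non-poor) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(42)
    X_lab, y_lab = make_data(80, rng)      # small labeled sample
    X_unlab, _ = make_data(500, rng)       # larger unlabeled pool
    X_test, y_test = make_data(300, rng)
    student, n_pseudo = pseudo_label(X_lab, y_lab, X_unlab, rng=rng)
    print("pseudo-labeled rows:", n_pseudo)
    print("test AUC:", round(auc(y_test, predict_proba(student, X_test)), 3))
```

Raising `threshold` admits fewer but cleaner pseudo-labels, while lowering it adds more rows at the risk of propagating teacher errors. Undersampling the majority class before each fit tends to raise coverage of the poor class at some cost in precision, which is the trade-off the abstract describes.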