Privacy protection, measurement error, and the integration of remote sensing and socioeconomic survey data
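This paper examines the measurement error that spatial anonymization methods in the World Bank LSMS-ISA surveys introduce when the survey data are integrated with remote sensing weather data. It finds that commonly used methods have limited impact on estimates of the relationship between weather and agricultural productivity, but that the degree of error depends on the remote sensing product chosen.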
When publishing socioeconomic survey data, survey programs apply a variety of statistical methods that preserve privacy at the cost of distorting the data. We explore the extent to which the spatial anonymization methods used to preserve privacy in the large-scale surveys supported by the World Bank Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) introduce measurement error into econometric estimates when those survey data are integrated with remote sensing weather data. Guided by a pre-analysis plan, we produce 90 linked weather-household datasets that vary by spatial anonymization method and remote sensing weather product. By varying the data along with the econometric model, we quantify the magnitude and significance of the measurement error that results from the loss of locational accuracy imposed by privacy protection measures. We find that the spatial anonymization techniques currently in general use have, on average, limited to no impact on estimates of the relationship between weather and agricultural productivity. However, the degree to which spatial anonymization introduces mismeasurement depends on which remote sensing weather product is used in the analysis. We conclude that care must be taken in choosing a remote sensing weather product to integrate with publicly available survey data.
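To make the mechanism concrete, the following is a minimal sketch, not the procedure used in the study. It assumes one common form of spatial anonymization (moving each coordinate a random distance in a random direction) and assumes that a weather product assigns values on a regular latitude-longitude grid. The function names, the 5 km displacement cap, and the grid resolutions are all hypothetical choices made for illustration. The sketch counts how often a displaced point lands in a different grid cell than its true location, which is one plausible channel through which the choice of product could matter.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def displace(lat, lon, max_km, rng):
    """Move a point a random distance (0..max_km) along a random bearing.

    Hypothetical anonymization rule: distance is uniform in radius,
    not uniform over the area of the disc.
    """
    distance = rng.uniform(0.0, max_km)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    delta = distance / EARTH_RADIUS_KM  # angular distance in radians
    lat1, lon1 = math.radians(lat), math.radians(lon)
    lat2 = math.asin(
        math.sin(lat1) * math.cos(delta)
        + math.cos(lat1) * math.sin(delta) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def grid_cell(lat, lon, resolution_deg):
    """Index of the grid cell containing a point (assumed regular grid)."""
    return (math.floor(lat / resolution_deg), math.floor(lon / resolution_deg))


def share_reassigned(points, max_km, resolution_deg, seed=0):
    """Share of points whose displaced location falls in a different cell."""
    rng = random.Random(seed)
    moved = 0
    for lat, lon in points:
        new_lat, new_lon = displace(lat, lon, max_km, rng)
        if grid_cell(new_lat, new_lon, resolution_deg) != grid_cell(lat, lon, resolution_deg):
            moved += 1
    return moved / len(points)


if __name__ == "__main__":
    # Hypothetical household locations; not taken from any survey.
    households = [(-13.5 + 0.01 * i, 34.0 + 0.013 * i) for i in range(100)]
    for resolution in (0.25, 0.5):  # illustrative grid resolutions in degrees
        share = share_reassigned(households, max_km=5.0, resolution_deg=resolution, seed=1)
        print(f"resolution {resolution} deg: share reassigned = {share:.2f}")
```

Under these assumptions, a finer grid reassigns at least as many households to a different cell as a coarser grid that it nests within, because any move across a coarse cell boundary also crosses a fine one. Whether a change of cell translates into meaningful measurement error depends on how much the weather variable differs between neighbouring cells, which this sketch does not model.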