A note on synthetic data for replication purposes in agricultural economics

Journal of Agricultural Economics · 2022
Cited by 11
Renmin University of China (RUC) A-ABS 3

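Summary

This note examines whether synthetic data can stand in for original data to support research replication in agricultural economics. Comparing input elasticities and technical efficiency estimated from original and synthetic data, it finds that synthetic data generated with the CART method perform best.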

Abstract

Empirical studies in agricultural economics usually involve policy implications. In many cases, such studies rely on proprietary or confidential data that cannot be published along with the article, challenging the replicability and credibility of the results. To overcome this problem, the use of synthetic data (that is, data that do not contain a single unit of the original data) has been proposed. In this note, we illustrate the utility of synthetic data generation methods for replication purposes using a range of methods from agricultural production analysis. More specifically, we compare input elasticities and technical efficiency scores based on different farm‐level production data between original data and synthetic data. We generate synthetic data using a non‐parametric method of classification and regression trees (CART) and parametric linear regressions. We find that synthetic data result in elasticities and technical efficiency distributions that are very similar to the original data, especially when generated with CART, and conclude with implications for the research community.
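The comparison the abstract describes can be illustrated with a small, self-contained Python sketch. It is not the authors' code or data. It assumes a simulated one-input Cobb-Douglas technology with a hypothetical true elasticity of 0.6, a hand-written single-regressor regression tree whose leaves act as donor pools, and bootstrap resampling for the first variable. All function names and parameter values are illustrative. This toy version can by chance re-pair an input with its own output, so it says nothing about disclosure risk.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def grow_tree(pairs, min_leaf=10):
    """Grow a regression tree on (x, y) pairs that are sorted by x.

    Each split minimises the summed squared error of y in the two children.
    Leaves store their observed y values so they can serve as donors.
    """
    n = len(pairs)
    best = None
    if n >= 2 * min_leaf:
        ys = [p[1] for p in pairs]
        total, total_sq = sum(ys), sum(v * v for v in ys)
        left_sum = left_sq = 0.0
        for i in range(1, n):
            v = ys[i - 1]
            left_sum += v
            left_sq += v * v
            too_small = i < min_leaf or n - i < min_leaf
            if too_small or pairs[i - 1][0] == pairs[i][0]:
                continue
            sse_left = left_sq - left_sum ** 2 / i
            sse_right = (total_sq - left_sq) - (total - left_sum) ** 2 / (n - i)
            if best is None or sse_left + sse_right < best[0]:
                best = (sse_left + sse_right, i)
    if best is None:
        return {"donors": [p[1] for p in pairs]}
    i = best[1]
    return {
        "cut": (pairs[i - 1][0] + pairs[i][0]) / 2,
        "left": grow_tree(pairs[:i], min_leaf),
        "right": grow_tree(pairs[i:], min_leaf),
    }


def leaf_donors(tree, x):
    """Walk down the tree and return the donor y values of the leaf holding x."""
    while "cut" in tree:
        tree = tree["left"] if x <= tree["cut"] else tree["right"]
    return tree["donors"]


def synthesize(log_x, log_y, rng, min_leaf=10):
    """Sequential synthesis: bootstrap the input, then draw output from tree leaves."""
    tree = grow_tree(sorted(zip(log_x, log_y)), min_leaf)
    syn_x = [rng.choice(log_x) for _ in log_x]
    syn_y = [rng.choice(leaf_donors(tree, x)) for x in syn_x]
    return syn_x, syn_y


# Simulated "original" farm data in logs (assumed values, not from the paper).
rng = random.Random(0)
n = 2000
true_elasticity = 0.6
log_x = [rng.gauss(2.0, 0.5) for _ in range(n)]
log_y = [1.0 + true_elasticity * x + rng.gauss(0.0, 0.2) for x in log_x]

syn_x, syn_y = synthesize(log_x, log_y, rng)

# In a log-log specification the OLS slope is the input elasticity.
e_orig = ols_slope(log_x, log_y)
e_syn = ols_slope(syn_x, syn_y)
print(f"elasticity, original data:  {e_orig:.3f}")
print(f"elasticity, synthetic data: {e_syn:.3f}")
```

Keeping leaves small limits how much of the within-leaf relationship between input and output is lost when outputs are drawn from donors. A replication exercise in the spirit of the note would run the same estimator on both data sets and compare the results, as the last lines do.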

Keywords: synthetic data; replication studies; agricultural economics; classification and regression trees