How small is big enough? Open labeled datasets and the development of deep learning

Industrial and Corporate Change · 2025

Cited by: 1

Renmin University of China BABS 3

Daniel Souza · University of Milan
Aldo Geuna · University of Turin
Jeff Rodríguez · Organisation for Economic Co-operation and Development (OECD)

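Summary

This study examines the key role of open labeled datasets (such as CIFAR-10) in deep learning as a technoscientific field. Using qualitative and quantitative analysis, it shows how dataset size, number of instances, and number of categories affected technological progress and citations in the early literature.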

Abstract

We investigate the emergence of Deep Learning as a technoscientific field, emphasizing the role of open labeled datasets. Through qualitative and quantitative analyses, we evaluate the role of datasets like Canadian Institute of Advanced Research - 10 classes (CIFAR-10) in advancing computer vision and object recognition, which are central to the Deep Learning revolution. Our findings highlight CIFAR-10’s crucial role and enduring influence on the field, as well as its importance in teaching ML techniques. Results also indicate that dataset characteristics such as size, number of instances, and number of categories were key factors. Econometric analysis confirms that CIFAR-10, a small but sufficiently large open dataset, played a significant and lasting role in technological advancements and had a major function in the development of the early scientific literature, as shown by citation metrics.

Deep learning · Computer vision · Open datasets · Technoscience
