Accessing Online Data: Web‐Crawling and Information‐Scraping Techniques to Automate the Assembly of Research Data

JOURNAL OF BUSINESS LOGISTICS · 2016
Cited by 37
Renmin University of China (RUC) A-ABS 3

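Reader's Guide

An introductory guide to web-based data collection for supply chain management researchers. It covers definitions, concepts, worked examples, code, performance optimization, and ethical responsibilities, and is meant to help researchers unfamiliar with these techniques get started.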

Abstract

There is a growing interest in leveraging alternate sources of empirical data, with an increasing emphasis being placed on the Internet. This paper serves as a primer for supply chain management (SCM) researchers that may be interested in leveraging Internet‐based sources for their own research, but perhaps not familiar with how to begin. Here, definitions and concepts critical to successful implementation in practice are provided. In addition, concrete, discipline‐relevant examples accompany the discussion, and are aided by a fully detailed online code supplement. Performance enhancements are discussed, as well as associated caveats and limitations. Additionally, insights and guidance are offered on the unique responsibilities for researchers to uphold the ethical spirit of scientific research when continuing along these paths. Pragmatic issues related to the application of these techniques are presented for consideration of individual researchers and the SCM community as a whole.

Supply chain management · Data science · Web crawling · Research methods