Data Quality Measures for Computational Research: Ensuring Informed Decisions with Emerging Data Sources
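This article reviews the history of advertising data, summarizes the total survey error and validity approaches, proposes two new criteria (audit validity and normative validity), and provides advertising scholars with a checklist for evaluating the quality of computational advertising data.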
The proliferation of computational advertising (CA) and other technological developments in artificial intelligence have greatly expanded the types of data used in advertising research. The advertising community needs ways to evaluate the quality of CA data. Although traditional frameworks for evaluating quality are still relevant, they must be updated for these new conditions. Data quality discussions are actively occurring in other fields, including marketing, machine learning, and computational social science. This article provides a short history of advertising data and a summary of the total survey error (TSE) and validity approaches to quality. It identifies and discusses three approaches (collaborative, independent, and synthetic) by which advertising scholars can access CA data. It then reviews how TSE and validity can be used to evaluate newer CA data situations and provides a bridge between CA data terminology and the quality considerations discussed in different literatures. It proposes and develops two new quality criteria: audit validity, which refers to whether the data can be independently validated and the findings replicated, and normative validity, which describes the ethical and responsible collection and use of data, avoiding harms and preserving the privacy of the individuals involved. Finally, this article provides a checklist that advertising scholars can use to evaluate the quality of CA data.