Augmenting fake content detection in online platforms: A domain adaptive transfer learning via adversarial training approach

Production and Operations Management · 2023
Cited by 32
Journal rankings: RUC A · FT50 · UTD24 · ABS 4

Summary

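The paper proposes a domain adaptive transfer learning method based on adversarial training. It learns linguistic features from labeled data in a source domain (general news) and uses them to improve fake content detection in target domains (political news, financial news, and online reviews) where labeled data are scarce.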

Abstract

Online platforms are experimenting with interventions such as content screening to moderate the effects of fake, biased, and incensing content. Yet, online platforms face an operational challenge in implementing machine learning algorithms for managing online content due to the labeling problem, where labeled data used for model training are limited and costly to obtain. To address this issue, we propose a domain adaptive transfer learning via adversarial training approach to augment fake content detection with collective human intelligence. We first start with a source domain dataset containing deceptive and trustworthy general news constructed from a large collection of labeled news sources based on human judgments and opinions. We then extract discriminating linguistic features commonly found in source domain news using advanced deep learning models. We transfer these features associated with the source domain to augment fake content detection in three target domains: political news, financial news, and online reviews. We show that domain invariant linguistic features learned from a source domain with abundant labeled examples can effectively improve fake content detection in a target domain with very few or highly unbalanced labeled data. We further show that these linguistic features offer the most value when the level of transferability between source and target domains is relatively high. Our study sheds light on the platform operation in managing online content and resources when applying machine learning for fake content detection. We also outline a modular architecture that can be adopted in developing content screening tools in a wide spectrum of fields.
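To make the idea of domain-invariant features concrete, the following is a minimal sketch of one common form of adversarial domain adaptation, gradient reversal. The abstract does not say which adversarial scheme or network the authors use, so everything here is an assumption for illustration: the function name `dann_gradients`, the scalar linear feature extractor, and the trade-off weight `lam` are hypothetical and stand in for the deep models mentioned above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def dann_gradients(w, a, b, x, y, d, lam):
    """Per-example gradients for a toy gradient-reversal setup (illustrative only).

    w   : weights of a linear feature extractor, feature f = w . x
    a   : weight of the fake/real classifier, p_fake = sigmoid(a * f)
    b   : weight of the domain classifier, p_target = sigmoid(b * f)
    x   : input feature vector
    y   : 1 (fake), 0 (trustworthy), or None for an unlabeled target example
    d   : 0 for the source domain, 1 for the target domain
    lam : assumed trade-off weight on the adversarial (domain) term
    """
    f = dot(w, x)

    # For cross-entropy with a sigmoid output, d(loss)/d(logit) = p - target.
    g_label = (sigmoid(a * f) - y) if y is not None else 0.0
    g_domain = sigmoid(b * f) - d

    # Both classifier heads descend their own loss as usual.
    grad_a = g_label * f
    grad_b = g_domain * f

    # Gradient reversal: the feature extractor descends the label loss
    # but ascends the domain loss, pushing features toward domain invariance.
    df = g_label * a - lam * (g_domain * b)
    grad_w = [df * xi for xi in x]
    return grad_w, grad_a, grad_b


# An unlabeled target-domain example contributes only the reversed domain signal.
grad_w, grad_a, grad_b = dann_gradients(
    w=[1.0, 0.0], a=0.3, b=0.5, x=[2.0, 3.0], y=None, d=1, lam=1.0
)
```

In this sketch, labeled source examples drive the fake/real classifier, while both source and target examples drive the domain classifier. Because the feature extractor receives the domain gradient with its sign flipped, it is discouraged from encoding cues that distinguish the two domains.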

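Keywords: fake content detection · transfer learning · adversarial training · online platforms · natural language processing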