SEntFiN 1.0: Entity‐aware sentiment analysis for financial news

Journal of the Association for Information Science and Technology (JASIST) · 2022
Cited by 66 · Top 4% among papers from the same journal and year
ABS 3

Summary

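This study releases SEntFiN 1.0, a human‐annotated sentiment dataset of 10,753 news headlines, of which 2,847 contain multiple entities, often with conflicting sentiments. It also proposes a framework that extracts entity‐relevant sentiment with a feature‐based approach rather than an expression‐based one. In the experiments, RoBERTa and finBERT achieve the highest average accuracy, 94.29%.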

Abstract

Fine‐grained financial sentiment analysis on news headlines is a challenging task requiring human‐annotated datasets to achieve high performance. Few studies have addressed the sentiment extraction task in a setting where multiple entities are present in a news headline. In an effort to further research in this area, we make publicly available SEntFiN 1.0, a human‐annotated dataset of 10,753 news headlines with entity‐sentiment annotations, of which 2,847 headlines contain multiple entities, often with conflicting sentiments. We augment our dataset with a database of over 1,000 financial entities and their various representations in news media, amounting to over 5,000 phrases. We propose a framework that enables the extraction of entity‐relevant sentiments using a feature‐based approach rather than an expression‐based approach. For sentiment extraction, we use 12 different learning schemes built on lexicon‐based and pretrained sentence representations and five classification approaches. Our experiments indicate that lexicon‐based N‐gram ensembles compare favorably with pretrained word embedding schemes such as GloVe. Overall, RoBERTa and finBERT (domain‐specific BERT) achieve the highest average accuracy of 94.29% and F1‐score of 93.27%. Further, using over 210,000 entity‐sentiment predictions, we validate the economic effect of sentiments on aggregate market movements over a long duration.

Financial sentiment analysis · Natural language processing · News text mining · Entity recognition