Beyond news headlines and TF-IDF: Enhancing text-based forecasting models with validated collocations and improved attention

International Journal of Forecasting · 2026
Cited by 0
ABS 3

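Summary

The paper proposes a method for improving text-based forecasting models by validating verb-noun collocation patterns and applying attention mechanisms. It finds that full-text analysis outperforms headline-only analysis, that collocations such as "price fell" signal lower oil prices, and that adding macroeconomic data further improves forecast accuracy.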

Abstract

This paper proposes a method to improve text-based forecasting models, specifically for crude oil prices. Using pattern validation and attention mechanisms, the study demonstrates notable improvements in predictive power over traditional approaches. A key finding is that analyzing the full text of news articles, rather than headlines alone, yields significant gains in forecasting accuracy. Furthermore, the model featuring verb-noun and noun-verb collocation pattern validation consistently outperforms benchmarks and models based solely on news headlines across various forecasting horizons. The results suggest that the presence of collocations such as ‘price fell’, ‘prices tumbled’, and ‘price dropped’ in crude-oil-related news articles is associated with lower oil price returns. Additionally, combining structured macroeconomic indicators with text-based features further improves forecasting accuracy.
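
To make the idea of a collocation feature concrete, the following is a minimal sketch of counting noun-verb pairs such as ‘price fell’ in an article's full text. It is not the paper's pipeline: the abstract does not describe how patterns are extracted or validated, so the word lists `NOUNS` and `VERBS`, the function `count_collocations`, the adjacency rule, and the sample article are all hypothetical and chosen only to cover the three collocations the abstract names.

```python
import re
from collections import Counter

# Hypothetical lexicons (assumption): just enough to cover the
# three example collocations quoted in the abstract.
NOUNS = {"price", "prices"}
VERBS = {"fell", "tumbled", "dropped"}


def count_collocations(text: str) -> Counter:
    """Count adjacent noun-verb pairs such as 'price fell' in `text`."""
    # Lowercase and keep alphabetic tokens only. Punctuation is discarded,
    # so this simple rule can also pair words across sentence boundaries.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for first, second in zip(tokens, tokens[1:]):
        if first in NOUNS and second in VERBS:
            counts[f"{first} {second}"] += 1
    return counts


# Made-up article text, used only to exercise the function.
article = (
    "Oil prices tumbled on Monday. "
    "The Brent price fell again, and the WTI price fell too."
)
features = count_collocations(article)
print(features)
```

Counts like these could serve as per-article inputs to a forecasting model. A headline-only variant would pass just the headline string to the same function, which is one way to see why full text can carry more signal: the headline of the sample article need not contain any of the pairs.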

crude oil price forecasting · text mining · natural language processing · economic forecasting