Disclosure Sentiment: Machine Learning vs. Dictionary Methods

Management Science · 2021
Cited by 41
Journal rankings: Renmin University of China A+ · FT50 · UTD24 · ABS 4*

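Summary

The paper compares how well dictionary-based and machine-learning methods capture disclosure sentiment at 10-K filing and conference-call dates. It finds that machine-learning methods explain returns significantly better than dictionary methods, with random-forest regression trees performing best.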

Abstract

We compare the ability of dictionary-based and machine-learning methods to capture disclosure sentiment at 10-K filing and conference-call dates. Like Loughran and McDonald [Loughran T, McDonald B (2011) When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J. Finance 66(1):35–65.], we use returns to assess sentiment. We find that measures based on machine learning offer a significant improvement in explanatory power over dictionary-based measures. Specifically, machine-learning measures explain returns at 10-K filing dates, whereas measures based on the Loughran and McDonald dictionary only explain returns at 10-K filing dates during the time period of their study. Moreover, at conference-call dates, machine-learning methods offer an improvement over the Loughran and McDonald dictionary method of a greater magnitude than the improvement of the Loughran and McDonald dictionary over the Harvard Psychosociological Dictionary. We further find that the random-forest-regression-tree method better captures disclosure sentiment than alternative algorithms, simplifying the application of the machine-learning approach. Overall, our results suggest that machine-learning methods offer an easily implementable, more powerful, and reliable measure of disclosure sentiment than dictionary-based methods. This paper was accepted by Brian Bushee, accounting.
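To make the dictionary-based baseline concrete, the following is a minimal sketch of a word-count tone measure: the share of positive words minus the share of negative words in a text. The word lists, function names, and sample sentence are hypothetical and chosen only for illustration; they are not the Loughran and McDonald or Harvard dictionaries, and the abstract does not specify the exact tone formula the authors use.

```python
import re

# Hypothetical mini word lists for illustration only; these are NOT the
# actual Loughran-McDonald or Harvard dictionaries.
NEGATIVE = {"loss", "decline", "impairment", "litigation", "adverse"}
POSITIVE = {"growth", "improve", "gain", "strong", "profitable"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_tone(text):
    """Return (positive count - negative count) / total tokens.

    An assumed, simplified tone definition; returns 0.0 for empty text.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


sample = "Strong growth offset a decline in legacy revenue."
print(dictionary_tone(sample))  # 2 positive, 1 negative, 8 tokens -> 0.125
```

A measure like this depends entirely on the fixed word list, whereas the machine-learning measures described in the abstract are fitted to returns rather than to a predefined list.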

Keywords: disclosure sentiment · machine learning · dictionary methods · random-forest regression trees