
Genre analysis of movies using a topic model of plot summaries

Journal of the Association for Information Science and Technology (JASIST) · 2021
Cited by 21
ABS 3

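Summary

This study applies an unsupervised topic model to a large collection of movie plot summaries to examine the recognizability, composition, canonicity, and change over time of movie genres. It finds that many genres can be predicted from their lexical signatures, and that the composition of genres such as westerns and science fiction changes significantly over time.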

Abstract

Genre plays an important role in the description, navigation, and discovery of movies, but it is rarely studied at large scale using quantitative methods. This allows an analysis of how genre labels are applied, how genres are composed and how these ingredients change, and how genres compare. We apply unsupervised topic modeling to a large collection of textual movie summaries and then use the model's topic proportions to investigate key questions in genre, including recognizability, mapping, canonicity, and change over time. We find that many genres can be quite easily predicted by their lexical signatures and this defines their position on the genre landscape. We find significant genre composition changes between periods for westerns, science fiction and road movies, reflecting changes in production and consumption values. We show that in terms of canonicity, canonical examples are often at the high end of the topic distribution profile for the genre rather than central as might be predicted by categorization theory.
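The contrast between a "central" example and one at the "high end" of a genre's topic profile can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the movie names and topic proportions are made up, and the choice of Euclidean distance to the mean profile as the measure of centrality is a stand-in, not necessarily the measure used in the paper.

```python
from math import sqrt

# Hypothetical topic proportions (each row sums to 1) for four made-up
# movies that share one genre label. Illustrative numbers only; they are
# not taken from the paper.
topic_props = {
    "movie_a": [0.80, 0.15, 0.05],
    "movie_b": [0.55, 0.30, 0.15],
    "movie_c": [0.50, 0.25, 0.25],
    "movie_d": [0.35, 0.40, 0.25],
}


def mean_profile(rows):
    """Average topic proportions across all movies in the genre."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(u, v):
    """Euclidean distance between two topic-proportion vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# The genre's average topic profile and its strongest ("signature") topic.
profile = mean_profile(list(topic_props.values()))
signature_topic = max(range(len(profile)), key=profile.__getitem__)

# "Central" example: the movie closest to the genre's mean profile.
most_central = min(topic_props, key=lambda m: euclidean(topic_props[m], profile))

# "High-end" example: the movie with the largest share of the signature topic.
most_extreme = max(topic_props, key=lambda m: topic_props[m][signature_topic])

print("central:", most_central, "high end:", most_extreme)
```

In this toy data the two criteria pick different movies, which is the kind of divergence the abstract describes: the abstract reports that canonical examples tend to resemble the second kind of movie rather than the first.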

Keywords: movie genres, natural language processing, topic models, text analysis