Accounting Variables, Deception, and a Bag of Words: Assessing the Tools of Fraud Detection
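Using the language of the management discussion and analysis section of annual and interim reports, this paper develops a probability measure for distinguishing fraudulent from truthful reports, compares it with eight other detection tools, and shows that the language-based approach is effective in both cross-sectional and time-series settings.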
Abstract: We develop a data-generated tool for distinguishing between fraudulent and truthful reports based on the language used in the management discussion and analysis section of annual and interim reports. Using this method, we assign a probability of truth to each report and show that this probability is an effective indicator of fraud. Our work goes beyond the development of a tool alone, however: we conduct an extensive comparison of our probability-of-truth measure with eight alternative detection tools representing both quantitative and language-based approaches. The comparisons are made across a variety of samples and show that our language-based approach can be effective in both cross-sectional and time-series settings. It is useful both in distinguishing fraudulent firms from truthful ones and in identifying the fraudulent reports within a series of reports issued by a single firm. This second setting is one in which accounting-based detection tools have frequently struggled. We establish that not only is our probability-of-truth measure significantly associated with fraud, but so too is the change in this measure from a firm's previous reports. Prior reports may therefore serve an important benchmarking role when language-based tools are used to identify fraud.
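To make the idea of a bag-of-words "probability of truth" concrete, the following is a minimal sketch and not the estimation procedure used in the paper. It assumes a multinomial naive Bayes classifier with add-one smoothing as a stand-in for the classifier, and it trains on four invented one-sentence "reports". The function names (`train`, `probability_of_truth`, `changes_from_previous`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

LABELS = ("truthful", "fraudulent")

# Invented toy data (assumption): real inputs would be full MD&A sections.
TOY_REPORTS = [
    ("revenue grew modestly and costs were stable", "truthful"),
    ("cash flow remained stable and debt declined", "truthful"),
    ("extraordinary unprecedented growth exceeded every expectation", "fraudulent"),
    ("unprecedented revenue growth despite extraordinary challenges", "fraudulent"),
]


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def train(reports):
    """Count words per class; word order is ignored (bag of words)."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in reports:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set().union(*(word_counts[label] for label in LABELS))
    return word_counts, doc_counts, vocab


def probability_of_truth(text, model):
    """Posterior probability of the 'truthful' class under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                smoothed = (word_counts[label][word] + 1) / (total_words + len(vocab))
                score += math.log(smoothed)
        log_score[label] = score
    # Normalise in log space to avoid underflow on long documents.
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["truthful"] / sum(weights.values())


def changes_from_previous(probabilities):
    """Change in the measure between consecutive reports of one firm."""
    return [curr - prev for prev, curr in zip(probabilities, probabilities[1:])]


if __name__ == "__main__":
    model = train(TOY_REPORTS)
    # A hypothetical series of reports issued by a single firm, in time order.
    series = [
        "costs were stable and debt declined",
        "revenue grew and cash flow remained stable",
        "extraordinary unprecedented growth",
    ]
    scores = [probability_of_truth(report, model) for report in series]
    print(scores)
    print(changes_from_previous(scores))
```

In the cross-sectional setting, the score would be compared across firms. In the time-series setting, `changes_from_previous` treats each firm's earlier reports as the benchmark, so a sharp drop in the score flags a report for closer inspection.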