Pitfalls of amateur regression: The Dutch New Herring controversies
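After analyzing ranking data for Dutch New Herring with simple linear regression, an economist alleged that the ranking was manipulated, which led to public outrage and the discontinuation of the survey. This paper reanalyzes the data, points out errors in his reasoning, and shows that the alleged evidence of manipulation could be an artifact of model specification errors.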
Abstract
Applying simple linear regression models, an economist analyzed a published dataset from the 2016 and 2017 editions of an influential annual ranking of consumer outlets for Dutch New Herring and concluded that the ranking was manipulated. His finding was promoted by his university in national and international media, which led to public outrage and the subsequent discontinuation of the survey. We reconstitute the dataset, correcting errors and exposing features that are already important in a descriptive analysis of the data. The economist has continued his investigations and, in a follow-up publication, repeats the same accusations. We point out errors in his reasoning and show that the alleged evidence for deliberate manipulation of the ranking could easily be an artifact of specification errors. Temporal and spatial factors are both important and complex, and their effects cannot be captured using simple models, given the small sample sizes and the many factors that determine the perceived taste of a food product.