Statistical Significance, p-Values, and the Reporting of Uncertainty

Journal of Economic Perspectives · 2021

Cited by 128

Renmin University of China journal list: A · ABS rating: 4

Guido W. Imbens (corresponding author)

Summary

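The article discusses the controversy over statistical significance and p-values across several disciplines. It argues that they often fail to answer decision questions, for which point estimates and confidence intervals are more useful. It also acknowledges that p-values are reasonable in some hypothesis-testing settings, and it criticizes the abuse of p-values.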

Abstract

The use of statistical significance and p-values has become a matter of substantial controversy in various fields using statistical methods. This has gone as far as some journals banning the use of indicators for statistical significance, or even any reports of p-values, and, in one case, any mention of confidence intervals. I discuss three of the issues that have led to these often-heated debates. First, I argue that in many cases, p-values and indicators of statistical significance do not answer the questions of primary interest. Such questions typically involve making (recommendations on) decisions under uncertainty. In that case, point estimates and measures of uncertainty in the form of confidence intervals or even better, Bayesian intervals, are often more informative summary statistics. In fact, in that case, the presence or absence of statistical significance is essentially irrelevant, and including them in the discussion may confuse the matter at hand. Second, I argue that there are also cases where testing null hypotheses is a natural goal and where p-values are reasonable and appropriate summary statistics. I conclude that banning them in general is counterproductive. Third, I discuss that the overemphasis in empirical work on statistical significance has led to abuse of p-values in the form of p-hacking and publication bias. The use of pre-analysis plans and replication studies, in combination with lowering the emphasis on statistical significance may help address these problems.

statistical significance · p-values · confidence intervals · Bayesian intervals
