Safe testing

Journal of the Royal Statistical Society. Series B: Statistical Methodology · 2024

Cited by 64 · Top 1% of papers in the same journal and year

ABS 4

Peter Grünwald (corresponding author)
Rianne de Heide
Wouter M. Koolen

Summary

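This paper proposes a theory of hypothesis testing based on the e-value. E-values resemble p-values, but they allow studies to be combined even when the decision to run a new study depends on earlier results, while keeping the error rate controlled. The paper also defines growth rate optimality as a substitute for power, and the theory applies to testing problems with composite hypotheses and nuisance parameters.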

Abstract

We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e. they preserve type-I error guarantees, under such optional continuation. We define growth rate optimality (GRO) as an analogue of power in an optional continuation context, and we show how to construct GRO e-variables for general testing problems with composite null and alternative, emphasizing models with nuisance parameters. GRO e-values take the form of Bayes factors with special priors. We illustrate the theory using several classic examples including a 1-sample safe t-test and the 2×2 contingency table. Sharing Fisherian, Neymanian, and Jeffreys–Bayesian interpretations, e-values may provide a methodology acceptable to adherents of all three schools.
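As a rough illustration of the combination idea in the abstract, the sketch below computes e-values for two studies and multiplies them. It assumes the simplest possible setting, which is much narrower than the composite problems the paper treats: Gaussian data with known standard deviation, a point null (mean 0) and a single fixed alternative mean, so that the likelihood ratio serves as the e-value. The function names, the data, and the rejection threshold 1/alpha (which follows from Markov's inequality) are illustrative choices, not code from the paper.

```python
import math

def likelihood_ratio_e_value(xs, mu_alt, sigma=1.0):
    """E-value for H0: mean = 0 against one fixed alternative mean = mu_alt.

    Assumption: Gaussian observations with known sigma. Under H0 the
    likelihood ratio has expectation 1, which is what makes it an e-value.
    """
    log_e = sum((mu_alt * x - 0.5 * mu_alt ** 2) / sigma ** 2 for x in xs)
    return math.exp(log_e)

def combine(e_values):
    """Combine e-values from successive studies by multiplication."""
    return math.prod(e_values)

def reject(e_value, alpha=0.05):
    """Reject H0 when the e-value reaches 1/alpha (Markov's inequality)."""
    return e_value >= 1.0 / alpha

# Hypothetical data: the second study is run only after seeing the first.
study_1 = [0.8, 1.1, 0.4, 1.5]
study_2 = [1.2, 0.9, 1.3]

e1 = likelihood_ratio_e_value(study_1, mu_alt=1.0)
e2 = likelihood_ratio_e_value(study_2, mu_alt=1.0)
e_total = combine([e1, e2])

print(reject(e1), reject(e2), reject(e_total))
```

With these made-up numbers neither study reaches the threshold of 20 on its own, but the product of the two e-values does.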

Statistics · Hypothesis testing · Bayesian statistics · Econometrics · Machine learning
