Statistical calibration for infinite many future values in linear regression: simultaneous or pointwise tolerance intervals or what else?
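This paper considers confidence sets for the x-values associated with infinitely many future y-values in linear regression. It points out that the method based on simultaneous tolerance intervals is overly conservative and that the method based on pointwise tolerance intervals rests on a misconception. It proposes weighted simultaneous tolerance intervals, which satisfy the key property exactly and perform better on real data.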
Abstract: Statistical calibration using regression is a useful statistical tool with many applications. For confidence sets for the x-values associated with infinitely many future y-values, there is a consensus in the statistical literature that the confidence sets constructed should guarantee a key property. It is well known that the confidence sets based on the simultaneous tolerance intervals (STIs) guarantee this key property, but only conservatively, so it is desirable to construct confidence sets that satisfy the property exactly. There is also a misconception that the confidence sets based on the pointwise tolerance intervals (PTIs) guarantee this property. This paper constructs weighted simultaneous tolerance intervals (WSTIs) such that the confidence sets based on the WSTIs satisfy the property exactly when the x-values of the future observations are distributed according to a known specific distribution F(⋅). Through the lens of the WSTIs, counterexamples are provided to demonstrate that the confidence sets based on the PTIs do not guarantee the key property in general and so should not be used. Applications to real data examples show that the WSTIs can produce more accurate calibration intervals than the STIs and PTIs.