Valid Confidence Intervals in Regression after Variable Selection

Econometric Theory · 1998
Cited by: 47
Journal rankings: Renmin University of China (人大) A-ABS 4

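Summary

The paper studies how to construct a confidence interval for the parameter θ₁ in a linear regression after variable selection. The interval is made as short as possible subject to a minimum coverage probability of 1 − α, and is then compared with the standard interval. The post-selection interval turns out to be preferable when |θₚ|/σ is anticipated to be small.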

Abstract

We consider a linear regression model with regression parameters (θ₁, ..., θₚ) and error variance parameter σ². Our aim is to find a confidence interval with minimum coverage probability 1 − α for a parameter of interest θ₁ in the presence of nuisance parameters (θ₂, ..., θₚ, σ²). We consider two confidence intervals, the first of which is the standard confidence interval for θ₁ with coverage probability 1 − α. The second confidence interval for θ₁ is obtained after a variable selection procedure has been applied to θₚ. This interval is chosen to be as short as possible subject to the constraint that it has minimum coverage probability 1 − α. The confidence intervals are compared using a risk function that is defined as a scaled version of the expected length of the confidence interval. We show that, subject to certain conditions including that [(dimension of response vector) − p] is small, the second confidence interval is preferable to the first when we anticipate (without being certain) that |θₚ|/σ is small. This comparison of confidence intervals is shown to be mathematically equivalent to a corresponding comparison of prediction intervals.
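
To see why the abstract insists on a minimum coverage probability constraint, the following minimal Monte Carlo sketch compares the standard interval with a naive post-selection interval, that is, one that simply reports the usual interval from whichever model a preliminary test selects. It is not the interval constructed in the paper. The setting is a deliberate simplification and every specific is an assumption made for illustration: p = 2, σ is known and equal to 1 (the paper treats σ² as unknown), the full-model estimators of θ₁ and θ₂ have unit variance and correlation `rho`, and θ₂ is dropped whenever its estimate is not significant at level α. The function name `coverage` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def coverage(theta2, rho=0.8, alpha=0.05, n_rep=20000, seed=0):
    """Estimate coverage of the standard and the naive post-selection
    interval for theta_1 by simulation.

    Assumptions (illustrative, not from the abstract): sigma = 1 is known,
    p = 2, and the full-model estimators have unit variance and
    correlation rho. Returns (standard_coverage, naive_coverage).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided normal quantile
    sd_restricted = (1 - rho ** 2) ** 0.5     # sd of the estimator when theta_2 is dropped
    theta1 = 0.0                              # coverage does not depend on theta_1 here
    hits_standard = 0
    hits_naive = 0
    for _ in range(n_rep):
        est2 = theta2 + rng.gauss(0, 1)
        # Restricted estimator of theta_1 (theta_2 set to zero): biased by
        # -rho * theta2 and independent of est2.
        est1_restricted = theta1 - rho * theta2 + sd_restricted * rng.gauss(0, 1)
        # The full-model estimator is recovered from the two pieces above.
        est1_full = est1_restricted + rho * est2

        covered_full = abs(est1_full - theta1) <= z
        hits_standard += covered_full

        if abs(est2) > z:
            # theta_2 is significant: keep the full model.
            hits_naive += covered_full
        else:
            # theta_2 is dropped: report the shorter restricted-model interval.
            hits_naive += abs(est1_restricted - theta1) <= z * sd_restricted
    return hits_standard / n_rep, hits_naive / n_rep


if __name__ == "__main__":
    for theta2 in (0.0, 1.0, 2.0, 3.0, 8.0):
        standard, naive = coverage(theta2)
        print(f"theta2/sigma = {theta2:4.1f}  standard = {standard:.3f}  naive = {naive:.3f}")
```

Under these assumptions the standard interval stays near 1 − α for every value of θ₂, while the naive interval falls well below it for moderate |θ₂|/σ, where the dropped variable biases the restricted estimator. This is the failure that a minimum coverage constraint rules out. The naive interval is shorter whenever the restricted model is selected, which is the potential gain in expected length that the paper's risk comparison quantifies.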

Regression after variable selection · Confidence interval · Coverage probability · Expected length