An In-Depth Look at Highest Posterior Model Selection

Econometric Theory · 2007
Cited by 12
Journal rankings: Renmin University of China A · ABS 4

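Summary

The paper studies the properties of the highest posterior probability model in linear regression, finds that it suffers from underfitting in finite samples and from excessive influence of the prior, and proposes a rescaled spike and slab hierarchy to mitigate these shortcomings.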

Abstract

We consider the properties of the highest posterior probability model in a linear regression setting. Under a spike and slab hierarchy we find that although highest posterior model selection is total risk consistent, it possesses hidden undesirable properties. One such property is a marked underfitting in finite samples, a phenomenon well noted for Bayesian information criterion (BIC) related procedures but not often associated with highest posterior model selection. Another concern is the substantial effect the prior has on model selection. We employ a rescaling of the hierarchy and show that the resulting rescaled spike and slab models mitigate the effects of underfitting because of a perfect cancellation of a BIC-like penalty term. Furthermore, by drawing upon an equivalence between the highest posterior model and the median model, we find that the effect of the prior is less influential on model selection, as long as the underlying true model is sparse. Nonsparse settings are, however, problematic. Using the posterior mean for variable selection instead of posterior inclusion probabilities avoids these issues.

Keywords: highest posterior model selection; spike and slab prior; model underfitting; variable selection
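
The equivalence between the highest posterior model and the median model mentioned in the abstract can be illustrated with a small numerical sketch. It is not the hierarchy analyzed in the paper. It assumes a deliberately simplified setting: an orthogonal design with X'X = n I, a known error variance `sigma2`, and an independent prior on each coefficient that puts mass `1 - w` on zero (the spike) and mass `w` on N(0, `v`) (the slab). The function names, the default values of `w` and `v`, and the `beta_hat` values are all hypothetical. Under these assumptions the posterior over models factorizes across coefficients, so enumerating all 2^K models and thresholding the inclusion probabilities at 1/2 should select the same model.

```python
import itertools
import math


def normal_pdf(x, var):
    """Density of N(0, var) evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_probs(beta_hat, n, sigma2=1.0, w=0.5, v=1.0):
    """Posterior inclusion probability of each coefficient.

    Assumption: X'X = n * I, so each least-squares estimate satisfies
    beta_hat[k] ~ N(beta_k, sigma2 / n), independently across k.
    """
    spike_var = sigma2 / n       # marginal variance when beta_k = 0
    slab_var = sigma2 / n + v    # marginal variance when beta_k ~ N(0, v)
    probs = []
    for b in beta_hat:
        slab = w * normal_pdf(b, slab_var)
        spike = (1.0 - w) * normal_pdf(b, spike_var)
        probs.append(slab / (slab + spike))
    return probs


def model_posteriors(probs):
    """Posterior probability of each of the 2^K models (product form)."""
    posteriors = {}
    for gamma in itertools.product((0, 1), repeat=len(probs)):
        p = 1.0
        for included, pk in zip(gamma, probs):
            p *= pk if included else 1.0 - pk
        posteriors[gamma] = p
    return posteriors


def highest_posterior_model(probs):
    """Model with the largest posterior probability, by full enumeration."""
    posteriors = model_posteriors(probs)
    return max(posteriors, key=posteriors.get)


def median_model(probs):
    """Model containing every variable with inclusion probability >= 1/2."""
    return tuple(1 if pk >= 0.5 else 0 for pk in probs)


# Hypothetical least-squares estimates for K = 4 coefficients, n = 100.
beta_hat = [0.02, 0.45, -0.10, 0.80]
probs = inclusion_probs(beta_hat, n=100)
print(highest_posterior_model(probs), median_model(probs))
```

With these illustrative inputs both rules keep the second and fourth variables and drop the other two. In non-orthogonal designs the factorization used here no longer holds, so the two selections can differ.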