On the Distributional Properties of Model Selection Criteria
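This paper studies how to quantify the parsimony principle in model selection, proposes that the penalty term should lie between 1.5 and 5, and applies the results to cross-validation to guide how the data should be partitioned.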
Abstract: It is commonly accepted that statistical modeling should follow the parsimony principle; namely, that simple models should be given priority whenever possible. But little is known quantitatively about how large a penalty for complexity is allowable. We try to understand the parsimony principle in the context of model selection. In particular, we consider the generalized final prediction error criterion and argue that its penalty term should be chosen between 1.5 and 5 in most practical situations. Applying our results to the cross-validation criterion, we obtain insights into how the data should be partitioned. We also discuss the small-sample performance of our methods.
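To make the role of the penalty term concrete, the following minimal sketch selects among nested linear regression models. The abstract does not state the criterion explicitly, so the sketch assumes a commonly used form, RSS(k) + lam * k * sigma2_hat, with sigma2_hat estimated from the full model. The function names `gfpe_select` and `demo`, the simulated data, and the nested-model setup are illustrative assumptions, not the paper's own procedure.

```python
import numpy as np


def gfpe_select(X, y, lam):
    """Pick the size k of the nested model (first k columns of X) that
    minimizes RSS(k) + lam * k * sigma2_hat.

    Assumed form of the criterion; sigma2_hat is the residual variance
    estimate from the full model with all p columns.
    """
    n, p = X.shape
    rss = np.empty(p + 1)
    rss[0] = float(y @ y)  # empty model: no predictors, no intercept
    for k in range(1, p + 1):
        beta, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        resid = y - X[:, :k] @ beta
        rss[k] = float(resid @ resid)
    sigma2_hat = rss[p] / (n - p)
    crit = rss + lam * np.arange(p + 1) * sigma2_hat
    return int(np.argmin(crit))


def demo(seed=0):
    """Simulate data with 3 active predictors out of 8 and compare penalties."""
    rng = np.random.default_rng(seed)
    n, p, k_true = 200, 8, 3
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:k_true] = 2.0  # strong signal on the first three columns
    y = X @ beta + rng.standard_normal(n)
    # Penalties at the ends of the suggested range, plus the classical value 2.
    return {lam: gfpe_select(X, y, lam) for lam in (1.5, 2.0, 5.0)}


if __name__ == "__main__":
    for lam, k in demo().items():
        print(f"lam={lam}: selected model size {k}")
```

Under this assumed form, a larger penalty can never select a larger nested model than a smaller penalty does, so moving from 1.5 toward 5 trades a lower risk of overfitting against a higher risk of dropping weak but real predictors.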