On Variable Selection in Linear Regression

Econometric Theory · 2002
Cited by: 20
Journal rankings: Renmin University of China list A; ABS 4

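Summary

Questions the conclusion, drawn from Shibata's results, that AIC is better than BIC. By fixing the data length and comparing the criteria across all possible data-generating mechanisms, the paper finds that AIC is not better than BIC even when the data length is large.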

Abstract

Shibata (1981, Biometrika 68, 45–54) considers data-generating mechanisms belonging to a certain class of linear regressions with errors that are independent and identically normally distributed. He compares the variable selection criteria AIC (Akaike information criterion) and BIC (Bayesian information criterion) using the following type of comparison. For each fixed possible data-generating mechanism, these criteria are compared as the data length increases. The results of this comparison have been interpreted as meaning that, in the context of the data-generating mechanisms considered by Shibata, AIC is better than BIC for large data lengths. Shibata's comparison is pointwise in the space of data-generating mechanisms (as the data length increases). Such comparisons are potentially misleading. We consider a simple class of data-generating mechanisms satisfying Shibata's assumptions and carry out a different type of comparison. For each fixed data length (possibly large), we compare the variable selection criteria for every possible data-generating mechanism in this class. According to this comparison, for this class of data-generating mechanisms, no matter how large the data length, AIC is not better than BIC.
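
To make the idea of a fixed-data-length comparison concrete, the following is a minimal sketch on a toy problem. It is not the class of mechanisms or the risk measure used in the paper, neither of which is specified in the abstract. The assumptions here are: a single regressor scaled so that its sum of squares equals n, a known error variance of 1, and a choice between the model with slope zero and the model with a free slope. Under these assumptions AIC keeps the regressor when z^2 > 2 and BIC keeps it when z^2 > log(n), where z is sqrt(n) times the least-squares slope. The names `scaled_risk` and `compare` are hypothetical, and the risk is the scaled squared error of the post-selection slope estimate, computed in closed form.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scaled_risk(t, c):
    """Risk of the post-selection slope estimate in the toy problem.

    Assumption (not from the paper): z ~ N(t, 1) with t = sqrt(n) * beta,
    and the estimate is z if z**2 > c, otherwise 0.
    Returns E[(estimate - t)**2].
    """
    a = math.sqrt(c)
    hi, lo = a - t, -a - t            # bounds for u = z - t when the regressor is dropped
    p_drop = Phi(hi) - Phi(lo)        # probability of selecting the smaller model
    # E[u^2 ; regressor kept] = 1 - E[u^2 ; lo <= u <= hi]
    kept = 1.0 - p_drop + (hi * phi(hi) - lo * phi(lo))
    return kept + t * t * p_drop      # dropping the regressor costs t^2


def compare(n, t_grid):
    """Risks of the AIC and BIC rules at a fixed data length n, over a grid of t."""
    c_aic, c_bic = 2.0, math.log(n)
    return [(t, scaled_risk(t, c_aic), scaled_risk(t, c_bic)) for t in t_grid]


if __name__ == "__main__":
    n = 1000
    print(f"n = {n}")
    print("   t     AIC risk   BIC risk")
    for t, r_aic, r_bic in compare(n, [0.5 * k for k in range(13)]):
        print(f"{t:5.1f}  {r_aic:9.4f}  {r_bic:9.4f}")
```

In this toy setting, scanning the grid of t at a fixed n shows that the ranking of the two rules depends on the mechanism: the BIC rule has the lower risk at t = 0 and the AIC rule has the lower risk near t = sqrt(log(n)). A comparison that fixes one mechanism and lets n grow never looks at such a family of mechanisms at the same n, which is the kind of difference between the two types of comparison that the abstract describes.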

Keywords: variable selection; AIC; BIC; linear regression