Comparing Predictive Accuracy, Twenty Years Later: A Personal Perspective on the Use and Abuse of Diebold–Mariano Tests
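Recalls that the Diebold-Mariano test was originally intended for comparing forecasts rather than models, identifies the problem of misusing the test in pseudo-out-of-sample environments, and emphasizes that full-sample model comparison methods are preferable.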
The Diebold-Mariano (DM) test was intended for comparing forecasts; it has been, and remains, useful in that regard. The test was not intended for comparing models. Much of the large ensuing literature, however, uses DM-type tests for comparing models, in pseudo-out-of-sample environments. In that case, simpler yet more compelling full-sample model comparison procedures exist; they have been, and should continue to be, widely used. The hunch that pseudo-out-of-sample analysis is somehow the "only," or "best," or even necessarily a "good" way to provide insurance against in-sample overfitting in model comparisons proves largely false. On the other hand, pseudo-out-of-sample analysis remains useful for certain tasks, perhaps most notably for providing information about comparative predictive performance during particular historical episodes.