Measuring Overfitting in Nonlinear Models: A New Method and an Application to Health Expenditures

Health Economics · 2013
Cited by: 45
Renmin University of China journal ranking: A-

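Summary

The paper proposes a new measure of overfitting that avoids the confounding by in-sample bias that affects existing measures in nonlinear models, and it validates the measure through simulation and healthcare expenditure data.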

Abstract

When fitting an econometric model, it is well known that we pick up part of the idiosyncratic characteristics of the data along with the systematic relationship between dependent and explanatory variables. This phenomenon is known as overfitting and generally occurs when a model is excessively complex relative to the amount of data available. Overfitting is a major threat to regression analysis in terms of both inference and prediction. We start by showing that the Copas measure becomes confounded by shrinkage or expansion arising from in-sample bias when applied to the untransformed scale of nonlinear models, which is typically the scale of interest when assessing behaviors or analyzing policies. We then propose a new measure of overfitting that is both expressed on the scale of interest and immune to this problem. We also show how to measure the respective contributions of in-sample bias and overfitting to the overall predictive bias when applying an estimated model to new data. We finally illustrate the properties of our new measure through both a simulation study and a real-data illustration based on inpatient healthcare expenditure data, which shows that the distinctions can be important.
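As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes a split-sample, Copas-style shrinkage slope for a plain linear model: fit on one half of the data, predict the other half, then regress the held-out outcomes on those predictions. A slope below 1 suggests overfitting. This is not the authors' new measure, and it deliberately stays in the linear case, where the in-sample bias discussed above does not arise. The function names (`ols_fit`, `copas_slope`, `split_sample_copas`) and the simulated data are assumptions made for illustration only.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares coefficients for y ~ [1, X] (intercept first)."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta


def ols_predict(beta, X):
    """Predictions from coefficients returned by ols_fit."""
    Z = np.column_stack([np.ones(X.shape[0]), X])
    return Z @ beta


def copas_slope(y_holdout, pred_holdout):
    """Slope from regressing held-out outcomes on their predictions.

    A value near 1 means the predictions carry over to new data;
    a value below 1 means they need shrinking, i.e. overfitting.
    """
    beta = ols_fit(pred_holdout.reshape(-1, 1), y_holdout)
    return beta[1]


def split_sample_copas(X, y, seed=0):
    """Hypothetical helper: fit on a random half, evaluate on the other half."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    train, hold = idx[:half], idx[half:]
    beta = ols_fit(X[train], y[train])
    return copas_slope(y[hold], ols_predict(beta, X[hold]))


if __name__ == "__main__":
    # Simulated data (an assumption): 20 regressors, only the first matters,
    # and a small sample so that the fitted model is overly complex.
    rng = np.random.default_rng(42)
    n, k = 80, 20
    X = rng.normal(size=(n, k))
    y = 1.0 + X[:, 0] + rng.normal(size=n)
    print("split-sample Copas slope:", split_sample_copas(X, y))
```

In a nonlinear model, for example one estimated on log expenditures and retransformed to dollars, the same slope computed on the untransformed scale would also reflect any in-sample bias, which is the confounding the abstract describes.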

Keywords: overfitting measure; nonlinear models; in-sample bias; healthcare expenditure prediction