Regression with an imputed dependent variable
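When the two variables in a relationship of interest are held in different data sets, researchers often impute the dependent variable. This article shows that commonly used imputation procedures yield inconsistent estimates and proposes a consistent, easily implemented two-step estimator, "rescaled regression prediction." It derives the correct asymptotic standard errors and illustrates the approach with data from the US Consumer Expenditure Survey and the Panel Study of Income Dynamics.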
Summary
Researchers are often interested in the relationship between two variables when no single data set contains both. A common strategy is to use proxies for the dependent variable that appear in both surveys to impute the dependent variable into the data set containing the independent variable. We show that commonly employed regression-based or matching-based imputation procedures lead to inconsistent estimates. We offer a consistent and easily implemented two-step estimator, “rescaled regression prediction.” We derive the correct asymptotic standard errors for this estimator and demonstrate its relationship to alternative approaches. We illustrate with empirical examples using data from the US Consumer Expenditure Survey (CE) and the Panel Study of Income Dynamics (PSID).
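
The summary does not spell out the estimator, so the following is only a minimal simulated sketch of one plausible reading of a two-step "regression prediction plus rescaling" procedure. It rests on assumptions that are not stated above: a single proxy Z that depends on the independent variable X only through the dependent variable Y, a linear model, and rescaling by the first-stage R². The function names, sample sizes and parameter values are hypothetical, and the standard errors discussed in the summary are not computed here.

```python
import numpy as np


def rescaled_regression_prediction(y_donor, z_donor, z_recipient, x_recipient):
    """Two-step sketch: impute Y from a proxy Z, then rescale the slope.

    ASSUMPTION (not from the summary): rescaling divides the second-step
    slope by the R^2 of the first-step regression of Y on Z.
    """
    # Step 1 (donor data, which holds Y and Z): OLS of Y on the proxy Z.
    slope, intercept = np.polyfit(z_donor, y_donor, 1)
    r2 = np.corrcoef(z_donor, y_donor)[0, 1] ** 2

    # Step 2 (recipient data, which holds X and Z): impute Y, regress on X.
    y_hat = intercept + slope * z_recipient
    unrescaled_slope, _ = np.polyfit(x_recipient, y_hat, 1)

    # Rescale the attenuated slope by the first-step R^2.
    return unrescaled_slope, unrescaled_slope / r2, r2


def simulate(n=100_000, beta=2.0, seed=0):
    """Draw hypothetical donor and recipient samples from the same population."""
    rng = np.random.default_rng(seed)
    noise_sd = np.sqrt(5.0)  # chosen so the population first-step R^2 is 0.5

    # Donor sample: Y and Z are used; X is generated only to build Y.
    x_d = rng.normal(size=n)
    y_d = beta * x_d + rng.normal(size=n)
    z_d = y_d + rng.normal(scale=noise_sd, size=n)

    # Recipient sample: X and Z are used; Y is treated as unobserved.
    x_r = rng.normal(size=n)
    y_r = beta * x_r + rng.normal(size=n)
    z_r = y_r + rng.normal(scale=noise_sd, size=n)

    return rescaled_regression_prediction(y_d, z_d, z_r, x_r)


if __name__ == "__main__":
    unrescaled, rescaled, r2 = simulate()
    print(f"first-step R^2: {r2:.3f}")
    print(f"unrescaled slope: {unrescaled:.3f}")
    print(f"rescaled slope: {rescaled:.3f}")
```

Under these simulation assumptions, the unrescaled slope is attenuated toward zero by roughly the first-step R², which is one way to see why regressing a plain regression-imputed dependent variable on X can be inconsistent and why a rescaling step can restore the true coefficient.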