Efficient estimation with missing data and endogeneity
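This paper studies missing values in the outcome and endogenous covariates in linear models. It proposes an estimator that is more efficient than complete-case two-stage least squares (2SLS), accommodates nonlinear functions of the endogenous covariates, and can be used to combine data sets with different missingness patterns.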
I study missing values in the outcome and in endogenous covariates in linear models. I propose an estimator that is more efficient than complete-case 2SLS. Unlike traditional imputation, my estimator remains consistent even if the model contains nonlinear functions of the endogenous covariates, such as squares and interactions. It can also be used to combine data sets with a missing outcome, missing endogenous covariates, and no missing variables. It nests the well-known “Two-Sample 2SLS” as a special case, under weaker assumptions than those imposed in the corresponding literature.
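For readers unfamiliar with the “Two-Sample 2SLS” special case, the following is a minimal sketch of the textbook version of that estimator. It is not the estimator proposed here. It assumes one sample in which only the outcome and the instruments are observed, and a second sample in which only the endogenous covariate and the instruments are observed. The function names (`two_sample_2sls`, `simulate`), the simulated design, and all parameter values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def two_sample_2sls(y1, Z1, X2, Z2):
    """Textbook two-sample 2SLS (illustrative sketch).

    y1, Z1: outcome and instruments from sample 1 (covariate missing).
    X2, Z2: regressors and instruments from sample 2 (outcome missing).
    Both X2 and the Z matrices are assumed to include a constant column.
    """
    # First stage, estimated in sample 2: regress X2 on Z2.
    pi_hat, *_ = np.linalg.lstsq(Z2, X2, rcond=None)
    # Predict the missing regressors in sample 1 from its instruments.
    X1_hat = Z1 @ pi_hat
    # Second stage, estimated in sample 1: regress y1 on the predictions.
    beta_hat, *_ = np.linalg.lstsq(X1_hat, y1, rcond=None)
    return beta_hat


def simulate(n, rng, beta=(1.0, 2.0)):
    """Hypothetical linear design with one endogenous covariate."""
    z = rng.normal(size=n)                # instrument
    u = rng.normal(size=n)                # structural error
    v = 0.8 * u + rng.normal(size=n)      # first-stage error, correlated with u
    x = 0.5 + 1.0 * z + v                 # endogenous covariate
    y = beta[0] + beta[1] * x + u         # outcome
    Z = np.column_stack([np.ones(n), z])
    X = np.column_stack([np.ones(n), x])
    return y, X, Z


rng = np.random.default_rng(0)
y1, _, Z1 = simulate(20_000, rng)   # sample 1: covariate treated as missing
_, X2, Z2 = simulate(20_000, rng)   # sample 2: outcome treated as missing
beta_ts2sls = two_sample_2sls(y1, Z1, X2, Z2)
print(beta_ts2sls)  # intercept and slope estimates
```

In this simulated design the true coefficients are (1, 2), so the printed estimates can be compared against those values. The sketch covers only the linear case with two disjoint samples; it does not handle squares or interactions of the endogenous covariate, nor the addition of a complete sample.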