General purpose multiply robust data integration procedures for handling nonprobability samples
提出一种结合多个结果回归模型和倾向得分模型的数据整合方法,用于估计总体参数,在只有一个模型设定正确时仍保持一致性,并通过模拟和韩国健康调查数据验证了其减少偏差和提高效率的优势。
Abstract In recent years, there has been an increased interest in combining probability and nonprobability samples. Nonprobability sample are cheaper and quicker to conduct but the resulting estimators are vulnerable to bias as the participation probabilities are unknown. To adjust for the potential bias, estimation procedures based on parametric or nonparametric models have been discussed in the literature. However, the validity of the resulting estimators relies heavily on the validity of the underlying models. Also, nonparametric approaches may suffer from the curse of dimensionality and poor efficiency. We propose a data integration approach by combining multiple outcome regression models and propensity score models. The proposed approach can be used for estimating general parameters including totals, means, distribution functions, and percentiles. The resulting estimators are multiply robust in the sense that they remain consistent if all but one model are misspecified. The asymptotic properties of point and variance estimators are established. The results from a simulation study show the benefits of the proposed method in terms of bias and efficiency. Finally, we apply the proposed method using data from the Korea National Health and Nutrition Examination Survey and data from the National Health Insurance Sharing Services.