A Semiparametric Mixture Approach to Case-Control Studies With Errors in Covariables
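To address measurement error in covariates in case-control studies, a semiparametric mixture model is proposed that uses both the validation subsample and the main sample, modeling the marginal distribution of the true covariate with a nonparametric mixture distribution to improve the efficiency of parameter estimation and reduce bias.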
Abstract: Methods are devised for estimating the parameters of a prospective logistic model in a case-control study with a dichotomous response D that depends on a covariate X. For a portion of the sample, both the gold standard X and a surrogate covariate W are available; for the greater portion of the data, however, only the surrogate covariate W is available. By using a mixture model, the relationship between the true covariate and the response can be modeled appropriately for both types of data. The likelihood depends on the marginal distribution of X and on the measurement error density of W given (X, D). The latter is modeled parametrically based on the validation sample. The marginal distribution of the true covariate is modeled using a nonparametric mixture distribution. In this way we can improve the efficiency and reduce the bias of the parameter estimates. The results also apply when there is no validation data, provided that the error distribution is known or estimated from an independent data source. Many of the results also apply to the easier case of prospective sampling.
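As a rough illustration of how the likelihood can depend on both the marginal distribution of X and the error density, the following sketch writes out the contribution of one validation observation and one main-sample observation under sampling conditional on D. The notation is assumed here for illustration and is not taken from the abstract: G denotes the marginal distribution of X, f(w | x, d; \alpha) the parametric error density, \beta = (\beta_0, \beta_1) the logistic parameters, and H the logistic function. It also assumes that inclusion in the validation subsample depends at most on D.

```latex
% Prospective logistic model (assumed notation):
\[
  \Pr(D = 1 \mid X = x;\beta) = H(\beta_0 + \beta_1 x),
  \qquad H(t) = \frac{1}{1 + e^{-t}} .
\]
% Validation observation (D_i, X_i, W_i): X is observed, so G enters
% through its mass (or density) at X_i and through the normalizing integral.
\[
  L_i^{\mathrm{val}}(\beta,\alpha,G)
  = \frac{\Pr(D_i \mid X_i;\beta)\, f(W_i \mid X_i, D_i;\alpha)\, dG(X_i)}
         {\int \Pr(D_i \mid x;\beta)\, dG(x)} .
\]
% Main-sample observation (D_j, W_j): X is unobserved and is integrated
% out against G, which gives the mixture form.
\[
  L_j^{\mathrm{main}}(\beta,\alpha,G)
  = \frac{\int \Pr(D_j \mid x;\beta)\, f(W_j \mid x, D_j;\alpha)\, dG(x)}
         {\int \Pr(D_j \mid x;\beta)\, dG(x)} .
\]
```

Both expressions follow from Bayes' rule applied to the joint distribution of (X, W) given D. If G is taken to be a discrete distribution with finitely many support points, as is typical for a nonparametric mixture, each integral reduces to a finite weighted sum over those points.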