Data Integration with Oracle Use of External Information from Heterogeneous Populations

Journal of Computational and Graphical Statistics · 2022
Cited by 36 · Top 4% among papers in the same journal and year
ABS 3

Summary

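The paper proposes a penalized constrained maximum likelihood method that automatically selects the useful information from multiple external studies and incorporates it into internal model estimation, achieving the same efficiency as an oracle estimator that uses only the useful external information.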

Abstract

It is common to have access to summary information from external studies. Such information can be useful for an internal study of interest to improve parameter estimation efficiency when incorporated. However, external studies may target populations different from the internal study, in which case an incorporation of the corresponding information may introduce estimation bias. We develop a penalized constrained maximum likelihood (PCML) method that simultaneously (a) selects the external studies whose information is useful for internal model fitting and (b) incorporates the corresponding information into internal estimation. The PCML estimator has the same efficiency as an oracle estimator that fully incorporates the useful external information alone. We establish estimation consistency, parametric rate of convergence, external information selection consistency, asymptotic normality, and oracle efficiency. An algorithm for implementation is provided, together with a data-adaptive tuning parameter selection. Supplemental materials are available online containing some details referred to throughout the article.

Econometrics · Statistical inference · Machine learning · Data fusion