Efficient and multiply robust risk estimation under general forms of dataset shift
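Studies how to use auxiliary populations to efficiently estimate target population risk under a range of dataset shift conditions, proposes efficient and multiply robust estimators, and gives a specification test for the shift conditions.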
Despite the extensive literature on dataset shift, few works address how to efficiently use auxiliary populations to improve the accuracy of risk evaluation for a given machine learning task in the target population. In this paper, we study the general problem of efficiently estimating target population risk under various dataset shift conditions, leveraging semiparametric efficiency theory. We consider a general class of dataset shift conditions, which includes three popular conditions (covariate, label and concept shift) as special cases. We allow for partially nonoverlapping support between the source and target populations. We develop efficient and multiply robust estimators, along with a straightforward specification test of these dataset shift conditions. We also derive efficiency bounds for two other dataset shift conditions, posterior drift and location-scale shift. Simulation studies support the efficiency gains obtained by leveraging plausible dataset shift conditions.
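To make the covariate shift special case concrete, the following is a minimal sketch of a standard augmented (doubly robust) risk estimator that pools source and target data. It is an illustration under stated assumptions, not necessarily the estimator developed in the paper: the simulated Gaussian data, the fixed predictor `predict`, the quadratic working model for the conditional mean loss, and the use of the known densities to compute the target-membership probability `pi` are all hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_target, n_source = 500, 5000

# Covariate shift (assumed setup): X differs across populations, Y | X does not.
x_t = rng.normal(1.0, 1.0, n_target)   # target covariates
x_s = rng.normal(0.0, 1.0, n_source)   # source covariates
x = np.concatenate([x_t, x_s])
is_target = np.concatenate([np.ones(n_target), np.zeros(n_source)])
y = x + rng.normal(0.0, 1.0, x.size)   # same outcome model in both populations


def predict(x):
    """Hypothetical fixed predictor whose target risk we want to estimate."""
    return 0.5 * x


loss = (y - predict(x)) ** 2            # squared-error loss

# Baseline: average loss over target data only, ignoring the source sample.
risk_target_only = loss[is_target == 1].mean()

# Nuisance 1: m(x) = E[loss | X = x], fit on pooled data.
# Pooling is justified here only because Y | X is shared under covariate shift.
design = np.column_stack([np.ones_like(x), x, x ** 2])
coef, *_ = np.linalg.lstsq(design, loss, rcond=None)
m_hat = design @ coef

# Nuisance 2: pi(x) = P(target | X = x). For simplicity this sketch uses the
# known simulation densities; in practice pi would be estimated.
rho = n_target / x.size


def normal_pdf(x, mean):
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2 * np.pi)


p_t, p_s = normal_pdf(x, 1.0), normal_pdf(x, 0.0)
pi = rho * p_t / (rho * p_t + (1 - rho) * p_s)

# Augmented estimator: plug-in mean of m_hat over the target sample, plus a
# weighted residual correction computed over the pooled sample.
risk_dr = m_hat[is_target == 1].mean() + np.mean(pi * (loss - m_hat)) / rho

print(f"target-only: {risk_target_only:.3f}, pooled augmented: {risk_dr:.3f}")
```

In this simulation the true target risk is 1.5, since the conditional mean loss is 0.25x² + 1 and the target covariate has mean 1 and variance 1. The pooled estimator borrows strength from the larger source sample through `m_hat`, which is the kind of efficiency gain the abstract refers to; the gain depends on the covariate shift condition actually holding.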