A Two-Sample Conditional Distribution Test Using Conformal Prediction and Weighted Rank Sum

Journal of the American Statistical Association · 2023
Cited by 12
ABS 4

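Summary

The paper proposes a nonparametric test, built on the conformal prediction framework, for checking whether the conditional distribution of a response variable given covariates is the same in two groups of data. The method is suited to high-dimensional, large-sample settings.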

Abstract

We consider the problem of testing the equality of conditional distributions of a response variable given a vector of covariates between two populations. Such a hypothesis testing problem is motivated by various machine learning and statistical inference scenarios, including transfer learning and causal predictive inference. We develop a nonparametric test procedure inspired by the conformal prediction framework. The construction of our test statistic combines recent developments in conformal prediction with a novel choice of conformity score, resulting in a weighted rank-sum test statistic that is valid and powerful under general settings. To our knowledge, this is the first successful attempt at using conformal prediction for testing statistical hypotheses beyond exchangeability. Our method is suitable for modern machine learning scenarios where the data are high-dimensional and sample sizes are large, and can be effectively combined with existing classification algorithms to find good conformity score functions. The performance of the proposed method is demonstrated in various numerical examples. Supplementary materials for this article are available online.
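
To make the idea of a weighted rank-sum statistic concrete, here is a minimal sketch in Python. It is not the authors' procedure: the function name `weighted_rank_sum`, the simulated data, the residual score `y - x`, and the use of the known covariate density ratio as weights are all assumptions made for illustration. The sketch relies only on a standard fact: if both populations share the same conditional distribution of Y given X and the weights equal the density ratio of the covariates (population 2 over population 1), then the weighted fraction of cross-sample pairs in which the sample-1 score falls below the sample-2 score has expectation 1/2 for a continuous score. Estimating the weights and the score from data (for example with a classifier) and calibrating a p-value are left out.

```python
import numpy as np


def weighted_rank_sum(scores1, weights1, scores2):
    """Average of w_i * 1{V_1i < V_2j} over all cross-sample pairs (i, j).

    Hypothetical helper for illustration; the name and signature are not
    taken from the paper.
    """
    scores1 = np.asarray(scores1, dtype=float)
    weights1 = np.asarray(weights1, dtype=float)
    scores2 = np.asarray(scores2, dtype=float)
    # below[i, j] is True when the i-th sample-1 score is smaller than
    # the j-th sample-2 score.
    below = scores1[:, None] < scores2[None, :]
    return float(np.mean(weights1[:, None] * below))


# Simulated null scenario (an assumption for this sketch): the covariate
# distributions differ, but Y given X is the same in both populations.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(0.0, 1.0, n)   # population 1 covariates, N(0, 1)
x2 = rng.normal(1.0, 1.0, n)   # population 2 covariates, N(1, 1)
y1 = x1 + rng.normal(size=n)   # same conditional law of Y given X
y2 = x2 + rng.normal(size=n)

# Density ratio of N(1, 1) to N(0, 1) evaluated at x1, known here in closed form.
weights = np.exp(x1 - 0.5)

# Residuals serve as a placeholder score; any fixed score gives mean 1/2 under the null.
u_null = weighted_rank_sum(y1 - x1, weights, y2 - x2)
```

In this simulated null scenario `u_null` should land close to 0.5. A score that separates the two conditional distributions would push the statistic away from 0.5 when they differ, which is the role the abstract assigns to the choice of conformity score.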

Statistical hypothesis testing · Nonparametric statistics · Machine learning · Causal inference