Semi-supervised distribution learning
研究半监督设置下的分布估计与推断,提出利用无标签数据近似条件分布的框架,估计量具有一致性和渐近高斯过程性质,在渐近效率上优于经验累积分布函数。
Abstract This study addresses the challenge of distribution estimation and inference in a semi-supervised setting. In contrast to prior research focusing on parameter inference, this work explores the complexities of semi-supervised distribution estimation, particularly the uniformity problem inherent in functional processes. To tackle this issue, we introduce a versatile framework designed to extract valuable information from unlabelled data by approximating a conditional distribution on covariates. The proposed estimator is derived using K-fold cross-fitting, and exhibits both consistency and asymptotic Gaussian process properties. Under mild conditions, the proposed estimator outperforms the empirical cumulative distribution function in terms of asymptotic efficiency. Several applications of the methodology are given, including parameter inference and goodness-of-fit tests.