Model-Free Conditional Feature Screening with FDR Control

Journal of the American Statistical Association · 2022
Cited by 26
ABS 4

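Summary

The article proposes a conditional feature screening method that requires no model assumptions and controls the false discovery rate. The method applies to ultra-high dimensional data, is robust to heavy-tailed distributions, and is validated on simulated and real data.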

Abstract

In this article, we propose a model-free conditional feature screening method with false discovery rate (FDR) control for ultra-high dimensional data. The proposed method is built upon a new measure of conditional independence. Thus, the new method does not require a specific functional form of the regression function and is robust to heavy-tailed responses and predictors. The variables to be conditional on are allowed to be multivariate. The proposed method enjoys sure screening and ranking consistency properties under mild regularity conditions. To control the FDR, we apply the Reflection via Data Splitting method and prove its theoretical guarantee using martingale theory and empirical process techniques. Simulated examples and real data analysis show that the proposed method performs very well compared with existing works. Supplementary materials for this article are available online.
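
To make the FDR-control step more concrete, the sketch below illustrates the general idea behind data-splitting thresholds. It is a minimal illustration under stated assumptions, not the article's implementation. The product of per-half Pearson correlations is only a placeholder for the article's conditional independence measure, and the conditioning variables are ignored. The knockoff-style threshold rule is likewise an assumed generic form, and all function names are hypothetical.

```python
import numpy as np


def split_product_statistics(x, y, rng):
    """Placeholder symmetric statistics from one random data split.

    Assumption: the Pearson correlation stands in for the article's
    conditional independence measure. For an irrelevant feature, the
    product of the two half-sample correlations is roughly symmetric
    about zero; for a relevant feature it tends to be positive.
    """
    n = x.shape[0]
    perm = rng.permutation(n)
    halves = (perm[: n // 2], perm[n // 2:])
    corrs = []
    for idx in halves:
        xc = x[idx] - x[idx].mean(axis=0)
        yc = y[idx] - y[idx].mean()
        denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
        corrs.append(xc.T @ yc / denom)
    return corrs[0] * corrs[1]


def fdr_threshold(w, alpha):
    """Smallest t with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= alpha.

    The count of large negative statistics serves as an estimate of the
    number of false discoveries among the large positive ones.
    Returns np.inf when no candidate threshold qualifies.
    """
    for t in np.sort(np.abs(w[w != 0])):
        fdp_hat = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= alpha:
            return t
    return np.inf


def select_features(w, alpha):
    """Indices of features whose statistic reaches the threshold."""
    return np.flatnonzero(w >= fdr_threshold(w, alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((400, 50))
    y = 2.0 * x[:, 0] - 2.0 * x[:, 1] + rng.standard_normal(400)
    w = split_product_statistics(x, y, rng)
    print(select_features(w, alpha=0.2))
```

With few truly relevant features, the `1 +` term in the numerator makes the rule conservative, so a small nominal level can lead to an empty selection.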

High-dimensional data · Feature screening · False discovery rate · Conditional independence