Statistical Inference for Maximin Effects: Identifying Stable Associations across Multiple Studies

Journal of the American Statistical Association · 2023
Cited by 7
ABS 4

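Summary

This paper proposes a new method for statistical inference on maximin effects across multiple studies. It constructs confidence intervals that help identify associations that are stable across populations, and uses yeast genetic data to verify that these associations generalize to new environments.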

Abstract

Integrative analysis of data from multiple sources is critical to making generalizable discoveries. Associations consistently observed across multiple source populations are more likely to generalize to target populations with possible distributional shifts. In this paper, we model heterogeneous multi-source data with multiple high-dimensional regressions and make inferences for the maximin effect (Meinshausen, Bühlmann, AoS, 43(4), 1801–1830). The maximin effect provides a measure of stable associations across multi-source data. A significant maximin effect indicates that a variable has commonly shared effects across multiple source populations, and these shared effects may generalize to a broader set of target populations. Inference for the maximin effect is challenging because its point estimator can have a non-standard limiting distribution. We devise a novel sampling method to construct valid confidence intervals for maximin effects. The proposed confidence interval attains a parametric length. This sampling procedure and the related theoretical analysis are of independent interest for solving other non-standard inference problems. Using genetic data on yeast growth in multiple environments, we demonstrate that genetic variants with significant maximin effects have generalizable effects in new environments. The proposed method is implemented in the R package MaximinInfer, available from CRAN.
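To make the notion of a maximin effect concrete, the sketch below computes it for a toy problem. It rests on assumptions not stated in the abstract: the characterization in the cited Meinshausen and Bühlmann paper, as recalled here, that the maximin effect is the point in the convex hull of the per-source regression vectors with the smallest value of b^T Σ b; a covariance matrix Σ shared by all sources; and regression vectors that are known exactly rather than estimated. The function names are hypothetical. This is neither the paper's sampling procedure for confidence intervals nor the API of MaximinInfer; it only illustrates the target quantity.

```python
def simplex_projection(v):
    """Euclidean projection of the list v onto the probability simplex."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        running_sum += u_j
        candidate = (running_sum - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # the last index passing the check sets the shift
    return [max(x - theta, 0.0) for x in v]


def maximin_weights(coefs, sigma, steps=2000):
    """Weights w on the simplex minimizing w^T G w, with G[l][k] = b_l^T Sigma b_k.

    coefs: list of per-source coefficient vectors (assumed known here).
    sigma: shared covariance matrix as a list of rows (an assumption).
    """
    n_groups, p = len(coefs), len(sigma)
    sigma_b = [[sum(sigma[i][j] * b[j] for j in range(p)) for i in range(p)]
               for b in coefs]
    gram = [[sum(coefs[l][i] * sigma_b[k][i] for i in range(p))
             for k in range(n_groups)] for l in range(n_groups)]
    w = [1.0 / n_groups] * n_groups
    # The largest absolute row sum bounds the top eigenvalue of the Gram matrix.
    bound = max(sum(abs(x) for x in row) for row in gram)
    if bound == 0.0:
        return w
    step = 1.0 / (2.0 * bound)
    for _ in range(steps):  # projected gradient descent
        grad = [2.0 * sum(gram[l][k] * w[k] for k in range(n_groups))
                for l in range(n_groups)]
        w = simplex_projection([w[l] - step * grad[l] for l in range(n_groups)])
    return w


def maximin_effect(coefs, sigma, steps=2000):
    """Weighted combination of the per-source coefficient vectors."""
    w = maximin_weights(coefs, sigma, steps)
    return [sum(w[l] * coefs[l][i] for l in range(len(coefs)))
            for i in range(len(sigma))]


# Toy example: both sources agree on variable 1 and disagree in sign on variable 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(maximin_effect([[1.0, 1.0], [1.0, -1.0]], identity))
```

In this toy case the two sources receive equal weight, so the shared effect of the first variable is retained while the sign-inconsistent effect of the second variable is shrunk to zero. This matches the abstract's reading of a nonzero maximin effect as a commonly shared effect.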

Integrative analysis · High-dimensional regression · Statistical inference · Genetics