Additive-Effect Assisted Learning

Journal of the Royal Statistical Society. Series B: Statistical Methodology · 2025
Citations: 0
ABS 4

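Summary

Data holders often find it hard to collaborate because of privacy and communication constraints. The paper proposes a two-stage assisted learning architecture: Alice first decides, from only a small amount of transmitted data, whether Bob's data are useful, and the two then train jointly to reach the optimal performance attainable with centralized data.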

Abstract

It is quite popular nowadays for researchers and data analysts holding different datasets to seek assistance from each other to enhance their modelling performance. We consider a scenario where different learners hold datasets with potentially distinct variables, and their observations can be aligned by a nonprivate identifier. Their collaboration faces the following difficulties: first, learners may need to keep data values or even variable names undisclosed due to, e.g., commercial interest or privacy regulations; second, there are restrictions on the number of transmission rounds between them due to, e.g., communication costs. To address these challenges, we develop a two-stage assisted learning architecture for an agent, Alice, to seek assistance from another agent, Bob. In the first stage, we propose a privacy-aware hypothesis testing-based screening method for Alice to decide on the usefulness of the data from Bob, in a way that only requires Bob to transmit sketchy data. Once Alice recognizes Bob’s usefulness, Alice and Bob move to the second stage, where they jointly apply a synergistic iterative model training procedure. With limited transmissions of summary statistics, we show that Alice can achieve the oracle performance as if the training were from centralized data, both theoretically and numerically.
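The second-stage idea, iterative training that approaches the centralized ("oracle") fit without pooling raw data, can be illustrated with a toy sketch. This is not the paper's procedure. It assumes a plain linear model with one covariate per agent and no intercept, and it uses classical backfitting, in which the two agents exchange only length-n vectors of residuals and fitted values. All function names (`assisted_fit`, `oracle_fit`, `make_toy_data`) and the simulated data are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_slope(x, target):
    """Least-squares slope of `target` on a single covariate x (no intercept)."""
    return dot(x, target) / dot(x, x)


def assisted_fit(y, x_alice, x_bob, rounds=50):
    """Toy backfitting: Alice and Bob alternate, exchanging only n-vectors.

    Alice never sees x_bob and Bob never sees x_alice or y directly;
    Alice sends her current residuals, Bob sends back his fitted values.
    """
    beta_a, beta_b = 0.0, 0.0
    bob_fitted = [0.0] * len(y)  # last vector received from Bob
    for _ in range(rounds):
        # Alice: fit the part of y not yet explained by Bob.
        beta_a = fit_slope(x_alice, [yi - bi for yi, bi in zip(y, bob_fitted)])
        residual = [yi - beta_a * xa for yi, xa in zip(y, x_alice)]  # sent to Bob
        # Bob: fit Alice's residual on his own covariate, return fitted values.
        beta_b = fit_slope(x_bob, residual)
        bob_fitted = [beta_b * xb for xb in x_bob]  # sent to Alice
    return beta_a, beta_b


def oracle_fit(y, x_a, x_b):
    """Centralized least squares via the 2x2 normal equations (benchmark)."""
    saa, sbb, sab = dot(x_a, x_a), dot(x_b, x_b), dot(x_a, x_b)
    say, sby = dot(x_a, y), dot(x_b, y)
    det = saa * sbb - sab * sab
    return (sbb * say - sab * sby) / det, (saa * sby - sab * say) / det


def make_toy_data(n=200, seed=0):
    """Simulated, correlated covariates; the coefficients 1.0 and 2.0 are arbitrary."""
    rng = random.Random(seed)
    x_a = [rng.gauss(0, 1) for _ in range(n)]
    x_b = [0.5 * a + rng.gauss(0, 1) for a in x_a]
    y = [1.0 * a + 2.0 * b + rng.gauss(0, 0.1) for a, b in zip(x_a, x_b)]
    return y, x_a, x_b


y, x_a, x_b = make_toy_data()
print("assisted:", assisted_fit(y, x_a, x_b))
print("oracle:  ", oracle_fit(y, x_a, x_b))
```

In this linear toy case the alternating updates are a block Gauss-Seidel iteration on the normal equations, so they converge to the centralized least-squares coefficients whenever the two covariates are not perfectly collinear. The paper's setting, its screening test, and its transmission scheme are more general than this sketch.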

Machine learning · Privacy protection · Distributed learning · Hypothesis testing