Privacy Protection and Statistical Efficiency Trade-Off for Federated Learning

INFORMS Journal on Computing · 2025
Cited by 1 · Top 10% among papers in the same journal and year
Renmin University of China list: B · UTD24 · ABS 3

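Summary

The paper studies the trade-off between statistical efficiency and privacy protection that arises when differential privacy is integrated into federated learning. Working with a linear regression model and a noisy gradient descent algorithm, it proposes a Polyak-Ruppert-type averaged estimator that balances statistical efficiency with privacy protection, and validates it through simulations and an enterprise data set.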

Abstract

Federated learning is a novel framework for distributed learning, which aims to break isolated data islands, as well as protect data privacy. To further prevent privacy leakage by specially crafted attacks, differential privacy is often integrated. Although differential privacy effectively secures sensitive information, it can reduce the statistical efficiency of the resulting estimators. This leads to a trade-off relationship between statistical efficiency and privacy protection. To theoretically understand this relationship, we start with the classic linear regression model and a noise-adding federated gradient descent algorithm. Its numerical convergence properties and asymptotic properties are rigorously studied. This results in fruitful insights into the trade-off relationship between statistical efficiency and privacy protection. Guided by these theoretical understandings, we further develop a Polyak-Ruppert-type averaged estimator, which can achieve good statistical efficiency with guaranteed privacy protection. Extensive simulation studies are presented to corroborate our theoretical results. Finally, we illustrate the application of our proposed method on an enterprise community data set.

History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.

Funding: Financial support from the National Natural Science Foundation of China [Grants 12401386, 72371241, 72495123, and 12271012], the Ministry of Education Project of Key Research Institute of Humanities and Social Sciences [Grant 22JJD910001], the Postdoctoral Fellowship Program of China Postdoctoral Science Foundation [Grant GZB20230070], and the Beijing Municipal Social Science Foundation [Grant 24GLC033] is gratefully acknowledged.

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0554 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2024.0554 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
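
As a rough illustration of the kind of procedure the abstract describes, the sketch below runs gradient descent for linear regression across several clients, adds Gaussian noise to each client's gradient, and then averages the later iterates in the Polyak-Ruppert style. It is not the authors' implementation (that is in the repository linked above). The function names, step size, noise scale, burn-in length, and synthetic data are all assumptions made for this example, and the noise is not calibrated to any (epsilon, delta) budget, so the sketch carries no formal privacy guarantee.

```python
import numpy as np


def make_clients(n_clients=5, n_per_client=200, seed=1):
    """Simulate hypothetical client data sets from one shared linear model."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([1.0, -2.0, 0.5])
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n_per_client, theta_true.size))
        y = X @ theta_true + rng.normal(scale=0.5, size=n_per_client)
        clients.append((X, y))
    return theta_true, clients


def noisy_federated_gd(client_data, lr=0.1, noise_std=0.05,
                       n_iter=500, burn_in=100, seed=0):
    """Illustrative noisy federated gradient descent for least squares.

    Returns the last iterate and the average of the post-burn-in iterates.
    All default values are assumptions chosen for the demo.
    """
    rng = np.random.default_rng(seed)
    p = client_data[0][0].shape[1]
    theta = np.zeros(p)
    running_sum = np.zeros(p)
    for t in range(n_iter):
        noisy_grads = []
        for X, y in client_data:
            # Local least-squares gradient on this client's data.
            grad = X.T @ (X @ theta - y) / len(y)
            # Each client perturbs its gradient before sharing it.
            noisy_grads.append(grad + rng.normal(0.0, noise_std, size=p))
        # The server averages the noisy gradients and takes one step.
        theta = theta - lr * np.mean(noisy_grads, axis=0)
        if t >= burn_in:
            running_sum += theta
    # Polyak-Ruppert-style average over the iterates after burn-in.
    theta_avg = running_sum / (n_iter - burn_in)
    return theta, theta_avg


theta_true, clients = make_clients()
theta_last, theta_avg = noisy_federated_gd(clients)
print("last iterate error:", np.linalg.norm(theta_last - theta_true))
print("averaged error:    ", np.linalg.norm(theta_avg - theta_true))
```

The intuition behind averaging is that the injected noise keeps the individual iterates fluctuating around the solution, while the mean of many iterates smooths out part of that fluctuation. How much efficiency is recovered, and under what conditions, is what the paper analyzes.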

Federated learning · Differential privacy · Statistical efficiency · Linear regression · Distributed learning