🌙

带一个和两个动量项的分布式梯度下降算法的收敛性分析

Convergence Analysis of Distributed Gradient Descent Algorithms With One and Two Momentum Terms

IEEE Transactions on Cybernetics · 2022
被引 9
ABS 3

中文导读

研究了在分布式网络中,添加一个或两个动量项对梯度下降算法收敛速度的影响,发现一个动量项可加速收敛,而两个动量项并无优势。

Abstract

For the centralized optimization, it is well known that adding one momentum term (also called the heavy-ball method) can obtain a faster convergence rate than the gradient method. However, for the distributed counterpart, there is quite few results about the effect of added momentum terms on the convergence rate. This article is aimed at studying the issue in the distributed setup, where N agents minimize the sum of their individual cost functions using local communication over a network. The cost functions are twice continuously differentiable. We first study the algorithm with one momentum term and develop a distributed heavy-ball (D-HB) method by adding one momentum term on to the distributed gradient algorithm. By borrowing tools from the control theory, we provide a simple convergence proof and an explicit expression of the optimal convergence rate. Furthermore, we consider adding two momentum terms case and propose a distributed double-heavy-ball (D-DHB) method. We show that adding one momentum term allows faster convergence while adding two momentum terms does not perform any superiorities. Finally, simulation examples are given to illustrate our findings.

分布式优化梯度下降算法动量方法收敛性分析