On the convergence of the gradient descent method with stochastic fixed-point rounding errors under the Polyak–Łojasiewicz inequality

Computational Optimization and Applications · 2025

Cited by 2

ABS 3

Lu Xia (corresponding author)
Stefano Massei
Michiel E. Hochstenbach

Summary

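This work studies how stochastic rounding strategies affect the convergence of gradient descent under low-precision fixed-point arithmetic, for problems satisfying the Polyak–Łojasiewicz inequality. It finds that biased stochastic rounding can eliminate the vanishing gradient problem and tighten the upper bound on the convergence rate.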
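
For orientation, the Polyak–Łojasiewicz (PL) inequality in its standard textbook form is sketched below. The symbols $f^*$ (optimal value), $\mu$ (PL constant) and $L$ (Lipschitz constant of the gradient) are notation assumed here and may differ from the paper's own. The second line is the classical exact-arithmetic rate for gradient descent with step size $1/L$, given only as a baseline against which rounding effects can be read. It is not the bound derived in the paper.

```latex
% PL inequality with constant \mu > 0 (assumed notation)
\frac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr)
\qquad \text{for all } x,

% Exact-arithmetic gradient descent, x_{k+1} = x_k - \tfrac{1}{L}\nabla f(x_k),
% with an L-Lipschitz gradient:
f(x_{k}) - f^{*} \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_{0}) - f^{*}\bigr).
```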

Abstract

In the training of neural networks with low-precision computation and fixed-point arithmetic, rounding errors often cause stagnation or are detrimental to the convergence of the optimizers. This study provides insights into the choice of appropriate stochastic rounding strategies to mitigate the adverse impact of roundoff errors on the convergence of the gradient descent method, for problems satisfying the Polyak–Łojasiewicz inequality. Within this context, we show that a biased stochastic rounding strategy may even be beneficial insofar as it eliminates the vanishing gradient problem and forces the expected roundoff error in a descent direction. Furthermore, we obtain a bound on the convergence rate that is stricter than the one achieved by unbiased stochastic rounding. The theoretical analysis is validated by comparing the performance of various rounding strategies when optimizing several examples using low-precision fixed-point arithmetic.
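
The stagnation mentioned in the abstract can be illustrated with a minimal sketch. It compares round-to-nearest with plain unbiased stochastic rounding on the toy problem f(x) = x²/2, which satisfies the PL inequality. The grid spacing `DELTA`, the step size, the iteration count and all function names are hypothetical choices made for illustration. The sketch does not implement the biased rounding strategy analysed in the paper.

```python
import numpy as np

DELTA = 2.0 ** -4  # hypothetical fixed-point spacing (4 fractional bits)


def round_nearest(x, delta=DELTA):
    """Deterministic rounding to the nearest grid point (ties go to even)."""
    return np.round(x / delta) * delta


def stochastic_round(x, rng, delta=DELTA):
    """Unbiased stochastic rounding: round up with probability equal to the
    fractional distance from the lower grid point, so E[result] == x."""
    scaled = x / delta
    lower = np.floor(scaled)
    round_up = rng.random() < (scaled - lower)
    return (lower + round_up) * delta


def gradient_descent(x0, step, n_steps, rounder):
    """Gradient descent on f(x) = x**2 / 2 (gradient is x), rounding each iterate."""
    x = x0
    for _ in range(n_steps):
        x = rounder(x - step * x)
    return x


rng = np.random.default_rng(0)  # seeded for reproducibility
x_rn = gradient_descent(1.0, 0.1, 300, round_nearest)
x_sr = gradient_descent(1.0, 0.1, 300, lambda v: stochastic_round(v, rng))
print("round-to-nearest final iterate:   ", x_rn)
print("stochastic rounding final iterate:", x_sr)
```

In this toy setting, round-to-nearest stalls at a nonzero grid point once the update `step * x` falls below half the grid spacing, because the rounded iterate no longer changes. Unbiased stochastic rounding still moves with positive probability, so the iterate keeps decreasing in expectation and ends up at the minimizer.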

Neural network training · Low-precision computation · Stochastic rounding strategies · Optimization convergence