A Stochastic Differential Equation Perspective on Stochastic Convex Optimization
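This paper uses stochastic differential equations to analyze convex optimization problems with noisy gradients. It proves that the objective function and the trajectory converge almost surely to the minimum, and it gives convergence rates in the convex, strongly convex, and Łojasiewicz cases.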
In this paper, we analyze the global and local behavior of gradient-like flows under stochastic errors, with the aim of solving convex optimization problems with noisy gradient input. We first study the unconstrained differentiable convex case, using a stochastic differential equation where the drift term is minus the gradient of the objective function and the diffusion term is either bounded or square-integrable. In this context, under Lipschitz continuity of the gradient, our first main result shows almost sure convergence of the objective and the trajectory process toward a minimizer of the objective function. We also provide a comprehensive complexity analysis by establishing several new pointwise and ergodic convergence rates in expectation for the convex, strongly convex, and (local) Łojasiewicz cases. The last of these involves a challenging local analysis which requires nontrivial arguments from measure theory. Then, we extend our study to the constrained case and, more generally, to nonsmooth problems. We show that several of our results have natural extensions obtained by replacing the gradient of the objective function with a cocoercive monotone operator. This makes it possible to obtain similar convergence results for optimization problems with an additive “smooth + nonsmooth” convex structure. Finally, we consider another extension of our results to nonsmooth optimization, which is based on the Moreau envelope.

Funding: This work was supported by Agence Nationale de la Recherche (ANR) [Grant ANR-20-CE92-0037-01].
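
As an illustration of the dynamics described in the abstract, the following is a minimal sketch of an Euler-Maruyama discretization of an equation of the form dX(t) = -∇f(X(t)) dt + σ(t) dW(t). Everything specific here is an assumption made for the example and is not taken from the paper: the quadratic objective f(x) = 0.5 ||x||^2, the scalar noise level σ(t) = 1/(1 + t) (chosen because it is square-integrable on [0, ∞)), the step size, and the function names. It is meant only to show the kind of behavior the results concern, not the authors' method of analysis.

```python
import math
import random


def grad_f(x):
    # Gradient of the illustrative objective f(x) = 0.5 * ||x||^2
    # (an assumed example; the paper treats general convex f).
    return list(x)


def objective(x):
    # Value of the assumed objective f(x) = 0.5 * ||x||^2.
    return 0.5 * sum(xi * xi for xi in x)


def sigma(t):
    # Assumed scalar diffusion coefficient; 1 / (1 + t) is
    # square-integrable on [0, infinity).
    return 1.0 / (1.0 + t)


def euler_maruyama(x0, step, n_steps, seed=0):
    # Simulate dX = -grad_f(X) dt + sigma(t) dW with a fixed step size.
    rng = random.Random(seed)
    x = list(x0)
    t = 0.0
    for _ in range(n_steps):
        g = grad_f(x)
        # Brownian increments over a step of length `step` have
        # standard deviation sqrt(step).
        noise_scale = sigma(t) * math.sqrt(step)
        x = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        t += step
    return x


x_start = [2.0, -1.0]
x_final = euler_maruyama(x_start, step=0.01, n_steps=5000)
print("f(x_start) =", objective(x_start))
print("f(x_final) =", objective(x_final))
```

With a vanishing, square-integrable noise level, the simulated objective value should end up close to the minimum value 0 of this toy problem; replacing `sigma` with a constant would instead leave the iterates fluctuating around the minimizer.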