Stochastic Subgradient Descent Escapes Active Strict Saddles on Weakly Convex Functions

Mathematics of Operations Research · 2023
Cited by: 4
ABS 3

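Summary

The paper shows that, in nonsmooth stochastic optimization, stochastic subgradient descent (SGD) does not converge to active strict saddles. Such points lie on a manifold on which the function has a direction of second-order negative curvature. Within the class of weakly convex functions, the assumptions behind this result hold generically.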

Abstract

In nonsmooth stochastic optimization, we establish the nonconvergence of the stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold M, where the function f has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of f is lower-bounded. We require two conditions on f. The first assumption is a Verdier stratification condition, which is a refinement of the popular Whitney stratification. It allows us to establish a strengthened version of the projection formula of Bolte et al. for Whitney stratifiable functions, a result of independent interest. The second assumption, termed the angle condition, allows us to control the distance of the iterates to M. When f is weakly convex, our assumptions are generic. Consequently, generically, in the class of definable weakly convex functions, SGD converges to a local minimizer.

Funding: The work of Sholom Schechtman was supported by “Région Ile-de-France”.
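As a rough illustration of the setting described in the abstract, the sketch below runs SGD on a toy function that is not taken from the paper: f(x, y) = |x| + (y^2 - 1)^2 / 4. This f is weakly convex (adding y^2 / 2 makes it convex). On the manifold M = {x = 0} it reduces to (y^2 - 1)^2 / 4, which has negative curvature at y = 0, and off M every Clarke subgradient has norm at least 1. The origin therefore appears to behave like an active strict saddle, while (0, 1) and (0, -1) are local minimizers. The step-size schedule, the Gaussian noise model, and all function and parameter names are assumptions made for this example only.

```python
import numpy as np


def subgradient(z):
    """One Clarke subgradient of f(x, y) = |x| + (y**2 - 1)**2 / 4.

    np.sign(0.0) is 0.0, which is a valid element of the
    subdifferential [-1, 1] of |x| at x = 0.
    """
    x, y = z
    return np.array([np.sign(x), y**3 - y])


def run_sgd(n_iters=5000, step0=0.2, decay=0.6, noise_std=0.1, seed=0):
    """Run SGD started exactly at the saddle (0, 0) and return the last iterate.

    Hypothetical choices for this sketch: steps gamma_k = step0 / (k + 1)**decay
    and i.i.d. Gaussian noise added to the subgradient.
    """
    rng = np.random.default_rng(seed)
    z = np.zeros(2)  # start at the saddle point itself
    for k in range(n_iters):
        gamma = step0 / (k + 1) ** decay
        noise = noise_std * rng.standard_normal(2)
        z = z - gamma * (subgradient(z) + noise)
    return z


if __name__ == "__main__":
    x_final, y_final = run_sgd()
    print(f"final iterate: x = {x_final:.4f}, y = {y_final:.4f}")
```

In this toy run the noise pushes the y coordinate away from 0 and the iterates settle near one of the minimizers (0, 1) or (0, -1), while the x coordinate stays close to M. This only illustrates the phenomenon; it does not check the Verdier or angle conditions required by the paper.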

Nonsmooth optimization · Stochastic optimization · Weakly convex functions · Saddle points