
Non-ergodic linear convergence of delayed gradient descent under strong convexity and the Polyak-Łojasiewicz condition

Abstract

In this work, we establish a non-ergodic linear convergence estimate for gradient descent with delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates of Arjevani et al. \cite{ASS} and Stich-Karimireddy \cite{SK} in that it is non-ergodic and holds under a weaker assumption on the cost function. Moreover, the admissible range of the learning rate $\eta$ is extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau=1$ and to $\eta\leq 3/(10L\tau)$ for $\tau\geq 2$, where $L>0$ is the Lipschitz constant of the gradient of the cost function. Furthermore, we show linear convergence of the cost function under the Polyak-{\L}ojasiewicz\,(PL) condition, for which the admissible learning rate improves further to $\eta\leq 9/(10L\tau)$ for large delay $\tau$. Finally, numerical experiments are provided to confirm the reliability of the analyzed results.
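The iteration studied in the abstract, gradient descent with a fixed delay $\tau$, can be sketched as follows. This is a minimal illustration on a strongly convex quadratic, not the paper's experimental setup; the function names, the test problem, and the choice of constants are assumptions, with the step size taken inside the abstract's range $\eta\leq 3/(10L\tau)$ for $\tau\geq 2$.

```python
import numpy as np

def delayed_gradient_descent(grad, x0, eta, tau, num_iters):
    """Delayed gradient descent: x_{k+1} = x_k - eta * grad(x_{k-tau}),
    using the initial point's gradient while k < tau."""
    history = [np.asarray(x0, dtype=float)]
    for k in range(num_iters):
        delayed_point = history[max(k - tau, 0)]  # gradient evaluated tau steps late
        history.append(history[-1] - eta * grad(delayed_point))
    return history[-1]

# Illustrative test problem (an assumption, not from the paper):
# f(x) = 0.5 * x^T A x is mu-strongly convex and L-smooth with mu = 1, L = 10.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

tau = 2
L = 10.0
eta = 3.0 / (10.0 * L * tau)  # within the abstract's admissible range for tau >= 2

x_final = delayed_gradient_descent(grad, np.array([5.0, -3.0]), eta, tau, 1000)
print(np.linalg.norm(x_final))  # distance to the minimizer x* = 0; decays linearly
```

Note that despite the stale gradients, the iterates still contract toward the minimizer as long as $\eta$ stays within the delay-dependent range above; taking $\eta$ too large relative to $1/(L\tau)$ can make the recursion unstable.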
