
Katyusha: The First Truly Accelerated Stochastic Gradient Method

Abstract

We introduce $\mathtt{Katyusha}$, the first direct stochastic gradient method that has an accelerated convergence rate. Given an objective that is an average of $n$ convex and smooth functions, $\mathtt{Katyusha}$ converges to an $\varepsilon$-approximate minimizer using $O\big((n + \sqrt{n \kappa})\cdot \log\frac{f(x_0)-f(x^*)}{\varepsilon}\big)$ stochastic iterations, where $\kappa$ is the condition number. $\mathtt{Katyusha}$ also resolves the following open questions in optimization and machine learning:

- For weakly convex and smooth objectives (e.g., Lasso, Logistic Regression), $\mathtt{Katyusha}$ is the first stochastic method that achieves the optimal $1/\sqrt{\varepsilon}$ rate.
- For strongly convex but non-smooth ERM objectives (e.g., SVM), $\mathtt{Katyusha}$ gives the first stochastic method that achieves the optimal $1/\sqrt{\varepsilon}$ rate.
- For weakly convex and non-smooth ERM objectives (e.g., L1-SVM), $\mathtt{Katyusha}$ gives the first stochastic method that achieves the optimal $1/\varepsilon$ rate.
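To make the rate statement above concrete, here is a minimal Python sketch of an accelerated variance-reduced loop in the Katyusha spirit: an SVRG-style gradient estimator coupled with a momentum sequence and a fixed weight on the snapshot point. The function name `svrg_momentum_sketch`, the parameter schedule (`tau1`, `tau2`, `alpha`), the epoch length, and the simplified snapshot update are illustrative assumptions for a strongly convex, smooth objective, not the paper's exact algorithm or pseudocode.

```python
# Illustrative sketch of an accelerated variance-reduced stochastic gradient loop
# (SVRG-style gradient estimator coupled with momentum). Parameter choices below
# are placeholders for illustration, NOT the paper's exact Katyusha schedule.
import numpy as np

def svrg_momentum_sketch(grad_i, grad_full, x0, n, L, sigma,
                         num_epochs=10, seed=0):
    """Minimize f(x) = (1/n) * sum_i f_i(x), each f_i L-smooth, f sigma-strongly convex.

    grad_i(i, x)  -- stochastic gradient of the i-th component at x
    grad_full(x)  -- full gradient of f at x (computed once per snapshot)
    """
    rng = np.random.default_rng(seed)
    tau2 = 0.5                                      # weight on the snapshot point
    tau1 = min(np.sqrt(n * sigma / (3 * L)), 0.5)   # acceleration weight (illustrative)
    alpha = 1.0 / (3 * tau1 * L)                    # step size for the z-sequence
    x_tilde = y = z = x0.copy()                     # snapshot and coupled iterates

    for _ in range(num_epochs):
        mu = grad_full(x_tilde)                     # full gradient at the snapshot
        for _ in range(2 * n):                      # one "epoch" of ~2n stochastic steps
            x = tau1 * z + tau2 * x_tilde + (1 - tau1 - tau2) * y
            i = rng.integers(n)
            g = grad_i(i, x) - grad_i(i, x_tilde) + mu   # variance-reduced gradient
            y = x - g / (3 * L)                     # gradient-descent-like step
            z = z - alpha * g                       # mirror-descent-like step
        x_tilde = y                                 # simplified snapshot update
    return x_tilde
```

Each epoch costs one full gradient plus roughly $2n$ cheap stochastic gradients, which is where the $n + \sqrt{n\kappa}$ per-accuracy-factor iteration count in the abstract comes from; the coupling of the three sequences ($x$, $y$, $z$) with a fixed weight on the snapshot is what distinguishes this style of method from plain SVRG.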
