
Katyusha: The First Truly Accelerated Stochastic Gradient Method

Abstract

We introduce $\mathtt{Katyusha}$, the first direct stochastic gradient method that has an accelerated convergence rate. Given an objective that is an average of $n$ convex and smooth functions, $\mathtt{Katyusha}$ converges to an $\varepsilon$-approximate minimizer using $O\big((n + \sqrt{n \kappa})\cdot \log\frac{f(x_0)-f(x^*)}{\varepsilon}\big)$ stochastic iterations, where $\kappa$ is the condition number. $\mathtt{Katyusha}$ also resolves the following open questions in optimization and machine learning:

- For weakly convex and smooth objectives (e.g., Lasso, Logistic Regression), $\mathtt{Katyusha}$ is the first stochastic method that achieves the optimal $1/\sqrt{\varepsilon}$ rate.
- For strongly convex but non-smooth ERM objectives (e.g., SVM), $\mathtt{Katyusha}$ gives the first stochastic method that achieves the optimal $1/\sqrt{\varepsilon}$ rate.
- For weakly convex and non-smooth ERM objectives (e.g., L1-SVM), $\mathtt{Katyusha}$ gives the first stochastic method that achieves the optimal $1/\varepsilon$ rate.
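To make the rate statement above concrete, here is a minimal Python sketch of an accelerated variance-reduced loop in the Katyusha spirit: an SVRG-style gradient estimator coupled with a momentum sequence and a fixed weight on the snapshot point. The function name `svrg_momentum_sketch`, the parameter schedule (`tau1`, `tau2`, `alpha`), the epoch length, and the simplified snapshot update are illustrative assumptions for a strongly convex, smooth objective, not the paper's exact algorithm or pseudocode.

```python
# Illustrative sketch of an accelerated variance-reduced stochastic gradient loop
# (SVRG-style gradient estimator coupled with momentum). Parameter choices below
# are placeholders for illustration, NOT the paper's exact Katyusha schedule.
import numpy as np

def svrg_momentum_sketch(grad_i, grad_full, x0, n, L, sigma,
                         num_epochs=10, seed=0):
    """Minimize f(x) = (1/n) * sum_i f_i(x), each f_i L-smooth, f sigma-strongly convex.

    grad_i(i, x)  -- stochastic gradient of the i-th component at x
    grad_full(x)  -- full gradient of f at x (computed once per snapshot)
    """
    rng = np.random.default_rng(seed)
    tau2 = 0.5                                      # weight on the snapshot point
    tau1 = min(np.sqrt(n * sigma / (3 * L)), 0.5)   # acceleration weight (illustrative)
    alpha = 1.0 / (3 * tau1 * L)                    # step size for the z-sequence
    x_tilde = y = z = x0.copy()                     # snapshot and coupled iterates

    for _ in range(num_epochs):
        mu = grad_full(x_tilde)                     # full gradient at the snapshot
        for _ in range(2 * n):                      # one "epoch" of ~2n stochastic steps
            x = tau1 * z + tau2 * x_tilde + (1 - tau1 - tau2) * y
            i = rng.integers(n)
            g = grad_i(i, x) - grad_i(i, x_tilde) + mu   # variance-reduced gradient
            y = x - g / (3 * L)                     # gradient-descent-like step
            z = z - alpha * g                       # mirror-descent-like step
        x_tilde = y                                 # simplified snapshot update
    return x_tilde
```

Each epoch costs one full gradient plus roughly $2n$ cheap stochastic gradients, which is where the $n + \sqrt{n\kappa}$ per-accuracy-factor iteration count in the abstract comes from; the coupling of the three sequences ($x$, $y$, $z$) with a fixed weight on the snapshot is what distinguishes this style of method from plain SVRG.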
