Efficient Stochastic Gradient Descent for Strongly Convex Optimization

Abstract

We motivate this study from a recent work on a stochastic gradient descent (SGD) method with only one projection \citep{DBLP:conf/nips/MahdaviYJZY12}, which aims at alleviating the computational bottleneck of the standard SGD method in performing a projection at each iteration, and enjoys an $O(\log T/T)$ convergence rate for strongly convex optimization. In this paper, we make further contributions along this line. First, we develop an epoch-projection SGD method that makes at most $\log_2 T$ projections yet achieves an optimal convergence rate of $O(1/T)$ for {\it strongly convex optimization}. Second, we present a proximal extension that exploits the structure of the objective function, which can further speed up both the computation and the convergence for sparse regularized loss minimization problems. Finally, we apply the proposed techniques to the high-dimensional large margin nearest neighbor classification problem, yielding a speed-up of orders of magnitude.
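The abstract describes the epoch-projection idea only at a high level: run unconstrained SGD steps within each epoch and project onto the feasible domain once per epoch, so that at most $\log_2 T$ projections are needed over $T$ iterations. The sketch below is a minimal Python illustration of that idea under our own assumptions; the function names, the doubling epoch schedule, and the step-size choice are illustrative and are not taken from the paper's pseudocode.

```python
import numpy as np

def epoch_projection_sgd(stoch_grad, project, w0, T, mu, T1=4):
    """Illustrative epoch-based SGD that projects only at epoch boundaries.

    stoch_grad(w) -- stochastic (sub)gradient oracle at w (assumed)
    project(w)    -- Euclidean projection onto the feasible domain (assumed)
    T             -- total number of stochastic gradient steps
    mu            -- strong-convexity modulus of the objective
    T1            -- length of the first epoch (illustrative constant)
    """
    w = project(np.asarray(w0, dtype=float))
    t_total, k = 0, 1
    while t_total < T:
        # Epoch lengths double, so the number of epochs (and projections)
        # is at most about log2(T).
        Tk = min(T1 * 2 ** (k - 1), T - t_total)
        v = w.copy()
        v_sum = np.zeros_like(w)
        for t in range(1, Tk + 1):
            eta = 1.0 / (mu * (t_total + t))   # standard 1/(mu*t) step size
            v = v - eta * stoch_grad(v)        # unconstrained SGD step
            v_sum += v
        # Single projection per epoch: project the epoch average.
        w = project(v_sum / Tk)
        t_total += Tk
        k += 1
    return w

# Toy usage: minimize E[0.5 * ||w - x||^2] with x ~ N(mean, I) over the unit ball.
rng = np.random.default_rng(0)
mean = np.array([2.0, -1.0])
grad = lambda w: w - (mean + rng.standard_normal(2))
proj = lambda w: w / max(1.0, np.linalg.norm(w))
w_hat = epoch_projection_sgd(grad, proj, np.zeros(2), T=100_000, mu=1.0)
```

In this toy run the constrained minimizer is the projection of `mean` onto the unit ball, and the point of the construction is that the (possibly expensive) `project` call is invoked only once per epoch rather than at every iteration.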
