Efficient Stochastic Gradient Descent for Strongly Convex Optimization

We motivate this study from a recent work on a stochastic gradient descent (SGD) method with only one projection \citep{DBLP:conf/nips/MahdaviYJZY12}, which aims at alleviating the computational bottleneck of the standard SGD method in performing the projection at each iteration, and enjoys an $O(\log T/T)$ convergence rate for strongly convex optimization. In this paper, we make further contributions along this line. First, we develop an epoch-projection SGD method that makes only a constant number of projections (less than $\log_2 T$) but achieves an optimal $O(1/T)$ convergence rate for {\it strongly convex optimization}. Second, we present a proximal extension that exploits the structure of the objective function, which can further speed up the computation and convergence for sparse regularized loss minimization problems. Finally, we consider an application of the proposed techniques to solving the high-dimensional large margin nearest neighbor classification problem, yielding a speed-up of orders of magnitude.
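To make the epoch-projection idea concrete, the sketch below illustrates the general scheme of projecting only once at the end of each epoch instead of after every stochastic gradient step. It is a minimal, hypothetical Python illustration, not the paper's exact algorithm or analysis: the names `grad_oracle` and `project`, the epoch lengths, and the step-size schedule are all assumptions chosen for readability.

```python
import numpy as np

def epoch_sgd_one_projection_per_epoch(grad_oracle, project, x0, mu, T,
                                        num_epochs=None):
    """Sketch of epoch-based SGD that projects only once per epoch.

    grad_oracle(x) -> a stochastic gradient of the objective at x
    project(x)     -> Euclidean projection onto the feasible domain
    mu             -> strong-convexity modulus of the objective
    T              -> total number of stochastic gradient steps

    The epoch lengths and step sizes below are illustrative choices, not the
    schedule analyzed in the paper.
    """
    if num_epochs is None:
        num_epochs = max(1, int(np.log2(T)))   # roughly log_2(T) epochs, one projection each
    steps_per_epoch = max(1, T // num_epochs)

    x = project(np.asarray(x0, dtype=float))
    for k in range(num_epochs):
        eta = 1.0 / (mu * 2 ** k)              # illustrative: halve the step size each epoch
        x_sum = np.zeros_like(x)
        for _ in range(steps_per_epoch):
            x = x - eta * grad_oracle(x)       # unconstrained SGD step, no projection here
            x_sum += x
        x = project(x_sum / steps_per_epoch)   # single projection at the end of the epoch
    return x

if __name__ == "__main__":
    # Toy usage: a regularized least-squares objective over the unit Euclidean ball.
    rng = np.random.default_rng(0)
    x_star = np.array([0.5, -0.3, 0.2])
    mu = 0.1

    def grad_oracle(x):
        a = rng.standard_normal(3)
        b = a @ x_star + 0.01 * rng.standard_normal()
        return (a @ x - b) * a + mu * x

    def project(x):
        nrm = np.linalg.norm(x)
        return x if nrm <= 1.0 else x / nrm

    print(epoch_sgd_one_projection_per_epoch(grad_oracle, project,
                                              x0=np.zeros(3), mu=mu, T=20000))
```

In this sketch the projection cost is paid once per epoch rather than once per iteration, which is the source of the computational savings when the feasible domain is expensive to project onto; a proximal variant would replace `project` with a proximal mapping of the regularizer (e.g., soft-thresholding for an $\ell_1$ term).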