
Improved Learning Rates for Stochastic Optimization

Main: 54 pages; Bibliography: 9 pages
Abstract

Stochastic optimization is a cornerstone of modern machine learning. This paper studies the generalization performance of two classical stochastic optimization algorithms: stochastic gradient descent (SGD) and Nesterov's accelerated gradient (NAG). We establish new learning rates for both algorithms, improving existing guarantees in some settings and matching them under weaker assumptions in others. We also provide numerical experiments that support the theory.
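For readers unfamiliar with the two algorithms, the update rules can be sketched as follows. This is a minimal illustration of textbook SGD and the momentum form of NAG, not the paper's exact setup; the step size `eta` and momentum `mu` are illustrative values, and the toy objective is a simple quadratic chosen only for the demo.

```python
import numpy as np

def sgd_step(w, grad, eta):
    # Plain SGD: w_{t+1} = w_t - eta * g_t, where g_t is a (stochastic) gradient.
    return w - eta * grad

def nag_step(w, w_prev, grad_fn, eta, mu):
    # Nesterov's accelerated gradient (momentum form):
    # evaluate the gradient at a look-ahead point y = w + mu * (w - w_prev),
    # then take a gradient step from y.
    y = w + mu * (w - w_prev)
    return y - eta * grad_fn(y)

# Toy usage: minimize f(w) = ||w||^2 / 2, whose gradient is w itself.
grad_fn = lambda w: w
w = np.array([1.0, -2.0])
w_prev = w.copy()
for _ in range(100):
    w, w_prev = nag_step(w, w_prev, grad_fn, eta=0.1, mu=0.9), w
# The iterates contract toward the minimizer at the origin.
```

On this quadratic the NAG iterates satisfy a two-term linear recurrence with complex roots of modulus 0.9, so the error shrinks geometrically.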
