ResearchTrend.AI
Cited By (arXiv:2006.10311)
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
18 June 2020
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
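The title refers to minibatch SGD in the interpolation regime, where a single parameter vector fits every training example and the stochastic gradient noise vanishes at the solution. A minimal sketch of that setting (the least-squares problem, step size, and batch size below are illustrative placeholders, not the paper's exact construction):

```python
import numpy as np

# Interpolation regime: y is generated exactly by w_true, so a zero-loss
# solution exists and minibatch gradient noise vanishes at the optimum.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

def minibatch_grad(w, batch):
    # Gradient of the least-squares loss on one minibatch.
    Xi, yi = X[batch], y[batch]
    return Xi.T @ (Xi @ w - yi) / len(batch)

w = np.zeros(d)
lr, batch_size = 0.05, 16  # constant step size; placeholder values
for _ in range(500):
    batch = rng.choice(n, size=batch_size, replace=False)
    w -= lr * minibatch_grad(w, batch)

print(loss(w))  # under interpolation, a constant step drives the loss near zero
```

With a constant step size, SGD on a non-interpolating problem would stall at a noise floor; here the loss keeps shrinking because the per-example gradients all vanish at `w_true`.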

Papers citing "SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation" (12 papers)
  • Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance · Dimitris Oikonomou, Nicolas Loizou · 06 Jun 2024
  • Demystifying SGD with Doubly Stochastic Gradients · Kyurae Kim, Joohwan Ko, Yian Ma, Jacob R. Gardner · 03 Jun 2024
  • Better Theory for SGD in the Nonconvex World · Ahmed Khaled, Peter Richtárik · 09 Feb 2020
  • Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval · Yan Shuo Tan, Roman Vershynin · 28 Oct 2019
  • Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates · Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien · 24 May 2019
  • Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions · Yunwen Lei, Ting Hu, Guiying Li, K. Tang · 03 Feb 2019
  • Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron · Sharan Vaswani, Francis R. Bach, Mark Schmidt · 16 Oct 2018
  • Momentum and Stochastic Momentum for Stochastic Gradient, Newton, Proximal Point and Subspace Descent Methods · Nicolas Loizou, Peter Richtárik · 27 Dec 2017
  • The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning · Siyuan Ma, Raef Bassily, M. Belkin · 18 Dec 2017
  • Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods · Guoyin Li, Ting Kei Pong · 09 Feb 2016
  • Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming · Saeed Ghadimi, Guanghui Lan · 22 Sep 2013
  • Minimizing Finite Sums with the Stochastic Average Gradient · Mark Schmidt, Nicolas Le Roux, Francis R. Bach · 10 Sep 2013