Accelerating SGD with momentum for over-parameterized learning

31 October 2018

Papers citing "Accelerating SGD with momentum for over-parameterized learning"

Title | Authors | Tag (where present) | Counts (unlabeled in source) | Date
On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent | Noah Golmant, N. Vemuri, Z. Yao, Vladimir Feinberg, A. Gholami, Kai Rothauge, Michael W. Mahoney, Joseph E. Gonzalez | | 75 73 0 | 30 Nov 2018
Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron | Sharan Vaswani, Francis R. Bach, Mark Schmidt | | 80 298 0 | 16 Oct 2018
On the insufficiency of existing momentum schemes for Stochastic Optimization | Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham Kakade | ODL | 83 119 0 | 15 Mar 2018
To understand deep learning we need to understand kernel learning | M. Belkin, Siyuan Ma, Soumik Mandal | | 65 419 0 | 05 Feb 2018
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning | Siyuan Ma, Raef Bassily, M. Belkin | | 79 289 0 | 18 Dec 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He | 3DH | 128 3,685 0 | 08 Jun 2017
Understanding deep learning requires rethinking generalization | Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals | HAI | 345 4,629 0 | 10 Nov 2016
An Analysis of Deep Neural Network Models for Practical Applications | A. Canziani, Adam Paszke, Eugenio Culurciello | | 88 1,168 0 | 24 May 2016
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning | Behnam Neyshabur, Ryota Tomioka, Nathan Srebro | AI4CE | 94 660 0 | 20 Dec 2014