On a continuous time model of gradient descent dynamics and instability in deep learning

3 February 2023 · arXiv:2302.01952
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin

Papers citing "On a continuous time model of gradient descent dynamics and instability in deep learning"

14 papers shown
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, M. Barkeshli
17 Feb 2025

Understanding the unstable convergence of gradient descent
Kwangjun Ahn, J.N. Zhang, S. Sra
03 Apr 2022

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

Implicit Regularization in ReLU Networks with the Square Loss
Gal Vardi, Ohad Shamir
09 Dec 2020

Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur
03 Oct 2020 · AAML

Implicit Gradient Regularization
David Barrett, Benoit Dherin
23 Sep 2020

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras
21 Feb 2020

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu
24 Jan 2019

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, S. Du, Michael I. Jordan, Weijie J. Su
21 Oct 2018

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016 · ODL

SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov, Frank Hutter
13 Aug 2016 · ODL

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter
23 Nov 2015

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe, James L. McClelland, Surya Ganguli
20 Dec 2013 · ODL

Riemannian metrics for neural networks I: feedforward networks
Yann Ollivier
04 Mar 2013