On a continuous time model of gradient descent dynamics and instability in deep learning

3 February 2023 · arXiv:2302.01952
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin

Papers citing "On a continuous time model of gradient descent dynamics and instability in deep learning"

14 papers shown
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, M. Barkeshli
17 Feb 2025

Understanding the unstable convergence of gradient descent
Kwangjun Ahn, J.N. Zhang, S. Sra
03 Apr 2022

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

Implicit Regularization in ReLU Networks with the Square Loss
Gal Vardi, Ohad Shamir
09 Dec 2020

Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur
03 Oct 2020 · AAML

Implicit Gradient Regularization
David Barrett, Benoit Dherin
23 Sep 2020

The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras
21 Feb 2020

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu
24 Jan 2019

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, S. Du, Michael I. Jordan, Weijie J. Su
21 Oct 2018

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016 · ODL

SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov, Frank Hutter
13 Aug 2016 · ODL

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter
23 Nov 2015

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe, James L. McClelland, Surya Ganguli
20 Dec 2013 · ODL

Riemannian metrics for neural networks I: feedforward networks
Yann Ollivier
04 Mar 2013