Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.03294
Cited By
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
7 October 2022
Xingyu Zhu
Zixuan Wang
Xiang Wang
Mo Zhou
Rong Ge
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Understanding Edge-of-Stability Training Dynamics with a Minimalist Example"
16 / 16 papers shown
Title
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang
Jingfeng Wu
Licong Lin
Peter L. Bartlett
56
2
0
05 Apr 2025
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra
Tianyu He
M. Barkeshli
105
6
0
17 Feb 2025
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
95
6
1
25 May 2024
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
41
46
0
26 Jul 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
67
73
0
14 Jun 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
88
97
0
19 May 2022
Understanding the unstable convergence of gradient descent
Kwangjun Ahn
J.N. Zhang
S. Sra
67
60
0
03 Apr 2022
Robust Training of Neural Networks Using Scale Invariant Architectures
Zhiyuan Li
Srinadh Bhojanapalli
Manzil Zaheer
Sashank J. Reddi
Surinder Kumar
53
28
0
02 Feb 2022
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang
Minshuo Chen
T. Zhao
Molei Tao
AI4CE
80
40
0
07 Oct 2021
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
Jeremy M. Cohen
Simran Kaur
Yuanzhi Li
J. Zico Kolter
Ameet Talwalkar
ODL
73
267
0
26 Feb 2021
Tilting the playing field: Dynamical loss functions for machine learning
M. Ruíz-García
Ge Zhang
S. Schoenholz
Andrea J. Liu
92
11
0
07 Feb 2021
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
182
241
0
04 Mar 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Kyunghyun Cho
Krzysztof J. Geras
76
161
0
21 Feb 2020
PyHessian: Neural Networks Through the Lens of the Hessian
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
51
302
0
16 Dec 2019
Theoretical Analysis of Auto Rate-Tuning by Batch Normalization
Sanjeev Arora
Zhiyuan Li
Kaifeng Lyu
72
131
0
10 Dec 2018
A Walk with SGD
Chen Xing
Devansh Arpit
Christos Tsirigotis
Yoshua Bengio
85
119
0
24 Feb 2018
1