Sharpness-Aware Minimization and the Edge of Stability

21 September 2023

Peter L. Bartlett

Papers citing "Sharpness-Aware Minimization and the Edge of Stability"

Title | Authors | Tag | Counts | Date
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training | Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan | AAML | 95 1 0 | 14 Oct 2024
Does SGD really happen in tiny subspaces? | Minhak Song, Kwangjun Ahn, Chulhee Yun | | 108 7 1 | 25 May 2024
Sharpness-Aware Minimization Leads to Low-Rank Features | Maksym Andriushchenko, Dara Bahri, H. Mobahi, Nicolas Flammarion | AAML | 100 25 0 | 25 May 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization | Yan Dai, Kwangjun Ahn, S. Sra | | 106 19 0 | 24 May 2023
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon | Atish Agarwala, Yann N. Dauphin | | 54 21 0 | 17 Feb 2023
Learning threshold neurons via the "edge of stability" | Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang | MLT | 82 41 0 | 14 Dec 2022
Second-order regression models exhibit progressive sharpening to the edge of stability | Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington | | 91 28 0 | 10 Oct 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example | Xingyu Zhu, Zixuan Wang, Xiang Wang, Mo Zhou, Rong Ge | | 113 39 0 | 07 Oct 2022
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima | Peter L. Bartlett, Philip M. Long, Olivier Bousquet | | 143 37 0 | 04 Oct 2022
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability | Alexandru Damian, Eshaan Nichani, Jason D. Lee | | 92 88 0 | 30 Sep 2022
Adaptive Gradient Methods at the Edge of Stability | Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, ..., Daniel Suo, David E. Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer | ODL | 96 54 0 | 29 Jul 2022
Towards Understanding Sharpness-Aware Minimization | Maksym Andriushchenko, Nicolas Flammarion | AAML | 96 142 0 | 13 Jun 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning | Sanjeev Arora, Zhiyuan Li, A. Panigrahi | MLT | 110 99 0 | 19 May 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes | Chao Ma, D. Kunin, Lei Wu, Lexing Ying | | 69 29 0 | 24 Apr 2022
Understanding the unstable convergence of gradient descent | Kwangjun Ahn, J.N. Zhang, S. Sra | | 85 63 0 | 03 Apr 2022
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability | Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar | ODL | 104 277 0 | 26 Feb 2021
Sharpness-Aware Minimization for Efficiently Improving Generalization | Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur | AAML | 199 1,358 0 | 03 Oct 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks | Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras | | 85 164 0 | 21 Feb 2020
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density | Behrooz Ghorbani, Shankar Krishnan, Ying Xiao | ODL | 86 326 0 | 29 Jan 2019