Large Learning Rates Improve Generalization: But How Large Are We Talking About?

19 November 2023
E. Lobacheva, Eduard Pockonechnyy, M. Kodryan, Dmitry Vetrov

Papers citing "Large Learning Rates Improve Generalization: But How Large Are We Talking About?"

7 / 7 papers shown
Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes
M. Kodryan, E. Lobacheva, M. Nakhodnov, Dmitry Vetrov (08 Sep 2022)
The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
R. Entezari, Hanie Sedghi, O. Saukh, Behnam Neyshabur (12 Oct 2021)
On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
E. Lobacheva, M. Kodryan, Nadezhda Chirkova, A. Malinin, Dmitry Vetrov (29 Jun 2021)
On the Origin of Implicit Regularization in Stochastic Gradient Descent
Samuel L. Smith, Benoit Dherin, David Barrett, Soham De (28 Jan 2021)
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora (06 Oct 2020)
Implicit Gradient Regularization
David Barrett, Benoit Dherin (23 Sep 2020)
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
Nikhil Iyer, V. Thejas, Nipun Kwatra, Ramachandran Ramjee, Muthian Sivathanu (09 Mar 2020)