Large Learning Rates Improve Generalization: But How Large Are We Talking About?

19 November 2023
E. Lobacheva, Eduard Pockonechnyy, M. Kodryan, Dmitry Vetrov

Papers citing "Large Learning Rates Improve Generalization: But How Large Are We Talking About?"

7 / 7 papers shown
Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes
M. Kodryan, E. Lobacheva, M. Nakhodnov, Dmitry Vetrov (08 Sep 2022)
The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
R. Entezari, Hanie Sedghi, O. Saukh, Behnam Neyshabur (12 Oct 2021)
On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
E. Lobacheva, M. Kodryan, Nadezhda Chirkova, A. Malinin, Dmitry Vetrov (29 Jun 2021)
On the Origin of Implicit Regularization in Stochastic Gradient Descent
Samuel L. Smith, Benoit Dherin, David Barrett, Soham De (28 Jan 2021)
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora (06 Oct 2020)
Implicit Gradient Regularization
David Barrett, Benoit Dherin (23 Sep 2020)
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
Nikhil Iyer, V. Thejas, Nipun Kwatra, Ramachandran Ramjee, Muthian Sivathanu (09 Mar 2020)