Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function
Lingkai Kong, Molei Tao
arXiv:2002.06189, 14 February 2020
Papers citing "Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function" (13 of 13 papers shown):
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett (05 Apr 2025)
The boundary of neural network trainability is fractal
Jascha Narain Sohl-Dickstein (09 Feb 2024)
From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Xuxing Chen, Krishnakumar Balasubramanian, Promit Ghosal, Bhavya Agrawalla (02 Oct 2023)
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu, Vladimir Braverman, Jason D. Lee (19 May 2023)
On the generalization of learning algorithms that do not converge
N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka (16 Aug 2022) [MLT]
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora (14 Jun 2022) [FAtt]
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying (24 Apr 2022)
Gradients are Not All You Need
Luke Metz, C. Freeman, S. Schoenholz, Tal Kachman (10 Nov 2021)
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
Yuqing Wang, Minshuo Chen, T. Zhao, Molei Tao (07 Oct 2021) [AI4CE]
Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein (29 Sep 2021)
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020) [ODL]
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su, Stephen P. Boyd, Emmanuel J. Candes (04 Mar 2015)
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur, Ryota Tomioka, Nathan Srebro (27 Feb 2015)