ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2303.02749
  4. Cited By
Revisiting the Noise Model of Stochastic Gradient Descent

Revisiting the Noise Model of Stochastic Gradient Descent

5 March 2023
Barak Battash
Ofir Lindenbaum
ArXivPDFHTML

Papers citing "Revisiting the Noise Model of Stochastic Gradient Descent"

30 / 30 papers shown
Title
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki
Shuhua Yu
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
83
2
0
17 Oct 2024
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework
Siyuan Yu
Wei Chen
H. V. Poor
66
0
0
17 Jun 2024
Power-law escape rate of SGD
Power-law escape rate of SGD
Takashi Mori
Liu Ziyin
Kangqiao Liu
Masakuni Ueda
46
19
0
20 May 2021
Refined Least Squares for Support Recovery
Refined Least Squares for Support Recovery
Ofir Lindenbaum
Stefan Steinerberger
16
6
0
19 Mar 2021
On the Origin of Implicit Regularization in Stochastic Gradient Descent
On the Origin of Implicit Regularization in Stochastic Gradient Descent
Samuel L. Smith
Benoit Dherin
David Barrett
Soham De
MLT
34
203
0
28 Jan 2021
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM
  in Deep Learning
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
Guosheng Lin
E. Weinan
71
234
0
12 Oct 2020
On the Generalization Benefit of Noise in Stochastic Gradient Descent
On the Generalization Benefit of Noise in Stochastic Gradient Descent
Samuel L. Smith
Erich Elsen
Soham De
MLT
49
99
0
26 Jun 2020
Dynamic of Stochastic Gradient Descent with State-Dependent Noise
Dynamic of Stochastic Gradient Descent with State-Dependent Noise
Qi Meng
Shiqi Gong
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
35
16
0
24 Jun 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
Jason D. Lee
Tengyu Ma
152
95
0
15 Jun 2020
Randomly Aggregated Least Squares for Support Recovery
Randomly Aggregated Least Squares for Support Recovery
Ofir Lindenbaum
Stefan Steinerberger
FedML
18
11
0
16 Mar 2020
On the Noisy Gradient Descent that Generalizes as SGD
On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu
Wenqing Hu
Haoyi Xiong
Jun Huan
Vladimir Braverman
Zhanxing Zhu
MLT
37
10
0
18 Jun 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Haowei He
Gao Huang
Yang Yuan
ODL
MLT
61
150
0
02 Feb 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural
  Networks
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli
Levent Sagun
Mert Gurbuzbalaban
82
247
0
18 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.6K
94,729
0
11 Oct 2018
Neural Network Acceptability Judgments
Neural Network Acceptability Judgments
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
226
1,407
0
31 May 2018
Essentially No Barriers in Neural Network Energy Landscape
Essentially No Barriers in Neural Network Energy Landscape
Felix Dräxler
K. Veschgini
M. Salmhofer
Fred Hamprecht
MoMe
105
432
0
02 Mar 2018
An Alternative View: When Does SGD Escape Local Minima?
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg
Yuanzhi Li
Yang Yuan
MLT
67
317
0
17 Feb 2018
Visualizing the Loss Landscape of Neural Nets
Visualizing the Loss Landscape of Neural Nets
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
240
1,885
0
28 Dec 2017
Don't Decay the Learning Rate, Increase the Batch Size
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
97
994
0
01 Nov 2017
Stochastic gradient descent performs variational inference, converges to
  limit cycles for deep networks
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
Pratik Chaudhari
Stefano Soatto
MLT
65
304
0
30 Oct 2017
Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic
  Differential Equations for Markov Chain Monte Carlo
Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo
Umut Simsekli
58
45
0
12 Jun 2017
The loss surface of deep and wide neural networks
The loss surface of deep and wide neural networks
Quynh N. Nguyen
Matthias Hein
ODL
148
284
0
26 Apr 2017
Stochastic Gradient Descent as Approximate Bayesian Inference
Stochastic Gradient Descent as Approximate Bayesian Inference
Stephan Mandt
Matthew D. Hoffman
David M. Blei
BDL
52
597
0
13 Apr 2017
A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics
A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics
Yuchen Zhang
Percy Liang
Moses Charikar
61
236
0
18 Feb 2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a
  nonasymptotic analysis
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Maxim Raginsky
Alexander Rakhlin
Matus Telgarsky
70
521
0
13 Feb 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
415
2,935
0
15 Sep 2016
A Variational Analysis of Stochastic Gradient Algorithms
A Variational Analysis of Stochastic Gradient Algorithms
Stephan Mandt
Matthew D. Hoffman
David M. Blei
50
161
0
08 Feb 2016
Deep Residual Learning for Image Recognition
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.1K
193,426
0
10 Dec 2015
Stochastic modified equations and adaptive stochastic gradient
  algorithms
Stochastic modified equations and adaptive stochastic gradient algorithms
Qianxiao Li
Cheng Tai
E. Weinan
59
284
0
19 Nov 2015
Practical recommendations for gradient-based training of deep
  architectures
Practical recommendations for gradient-based training of deep architectures
Yoshua Bengio
3DH
ODL
185
2,195
0
24 Jun 2012
1