arXiv:2006.04740
The Heavy-Tail Phenomenon in SGD
Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu — 8 June 2020
Papers citing "The Heavy-Tail Phenomenon in SGD" (35 of 35 papers shown)
Understanding the Generalization Error of Markov algorithms through Poissonization
Benjamin Dupuis, Maxime Haddouche, George Deligiannidis, Umut Simsekli — 11 Feb 2025
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki, Shuhua Yu, Pranay Sharma, Gauri Joshi, Dragana Bajović, D. Jakovetić, S. Kar — 17 Oct 2024
Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates
Puning Zhao, Xiaogang Xu, Zhe Liu, Chong Wang, Rongfei Fan, Qingming Li — 19 Aug 2024
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis, Paul Viallard, George Deligiannidis, Umut Simsekli — 26 Apr 2024
SGD with Clipping is Secretly Estimating the Median Gradient
Fabian Schaipp, Guillaume Garrigos, Umut Simsekli, Robert M. Gower — 20 Feb 2024
From Mutual Information to Expected Dynamics: New Generalization Bounds for Heavy-Tailed SGD
Benjamin Dupuis, Paul Viallard — 01 Dec 2023
Provably Robust Temporal Difference Learning for Heavy-Tailed Rewards
Semih Cayci, A. Eryilmaz — 20 Jun 2023
A Heavy-Tailed Algebra for Probabilistic Programming
Feynman T. Liang, Liam Hodgkinson, Michael W. Mahoney — 15 Jun 2023
Understanding the Generalization Ability of Deep Learning Algorithms: A Kernelized Renyi's Entropy Perspective
Yuxin Dong, Tieliang Gong, Hao Chen, Chen Li — 02 May 2023
Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent
Liu Ziyin, Botao Li, Tomer Galanti, Masakuni Ueda — 23 Mar 2023
Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize
Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Lingjiong Zhu — 10 Feb 2023
Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
Anant Raj, Lingjiong Zhu, Mert Gurbuzbalaban, Umut Simsekli — 27 Jan 2023
On the Overlooked Structure of Stochastic Gradients
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li — 05 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang, Yongyi Mao — 19 Nov 2022
Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
Haibo Yang, Pei-Yuan Qiu, Jia Liu — 03 Oct 2022
Algorithmic Stability of Heavy-Tailed Stochastic Gradient Descent on Least Squares
Anant Raj, Melih Barsbey, Mert Gurbuzbalaban, Lingjiong Zhu, Umut Simsekli — 02 Jun 2022
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
Hoileong Lee, Fadhel Ayed, Paul Jung, Juho Lee, Hongseok Yang, François Caron — 17 May 2022
Heavy-Tail Phenomenon in Decentralized SGD
Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu — 13 May 2022
An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate
Sayar Karmakar, Anirbit Mukherjee — 26 Apr 2022
Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise
D. Jakovetić, Dragana Bajović, Anit Kumar Sahu, S. Kar, Nemanja Milošević, Dusan Stamenkovic — 06 Apr 2022
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko, Xiantao Li — 21 Mar 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto, Hans Kersting, F. Proske, Francis R. Bach, Aurelien Lucchi — 06 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li — 31 Jan 2022
Impact of classification difficulty on the weight matrices spectra in Deep Learning and application to early-stopping
Xuran Meng, Jianfeng Yao — 26 Nov 2021
Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
Tolga Birdal, Aaron Lou, Leonidas J. Guibas, Umut Simsekli — 25 Nov 2021
A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning
Sulaiman A. Alghunaim, Kun Yuan — 19 Oct 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel — 15 Jun 2021
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
A. Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu — 09 Jun 2021
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance
Hongjian Wang, Mert Gurbuzbalaban, Lingjiong Zhu, Umut Simsekli, Murat A. Erdogdu — 20 Feb 2021
Convergence of stochastic gradient descent schemes for Łojasiewicz-landscapes
Steffen Dereich, Sebastian Kassing — 16 Feb 2021
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu — 16 Jun 2020
Sharp Concentration Results for Heavy-Tailed Distributions
Milad Bakhshizadeh, A. Maleki, Víctor Pena — 30 Mar 2020
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari — 04 Mar 2020
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin, Michael W. Mahoney — 02 Oct 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang — 15 Sep 2016