arXiv:2002.05685
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
13 February 2020
Umut Simsekli
Lingjiong Zhu
Yee Whye Teh
Mert Gurbuzbalaban
Papers citing "Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise"
26 / 26 papers shown
The Heavy-Tail Phenomenon in SGD
Mert Gurbuzbalaban
Umut Simsekli
Lingjiong Zhu
45
126
0
08 Jun 2020
Why are Adaptive Methods Good for Attention Models?
J.N. Zhang
Sai Praneeth Karimireddy
Andreas Veit
Seungyeon Kim
Sashank J. Reddi
Sanjiv Kumar
S. Sra
90
80
0
06 Dec 2019
On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
Umut Simsekli
Mert Gurbuzbalaban
T. H. Nguyen
G. Richard
Levent Sagun
61
58
0
29 Nov 2019
Non-Gaussianity of Stochastic Gradient Noise
A. Panigrahi
Raghav Somani
Navin Goyal
Praneeth Netrapalli
58
52
0
21 Oct 2019
First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise
T. H. Nguyen
Umut Simsekli
Mert Gurbuzbalaban
G. Richard
47
64
0
21 Jun 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
J.N. Zhang
Tianxing He
S. Sra
Ali Jadbabaie
72
459
0
28 May 2019
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks
Charles H. Martin
Michael W. Mahoney
42
56
0
24 Jan 2019
Non-Asymptotic Analysis of Fractional Langevin Monte Carlo for Non-Convex Optimization
T. H. Nguyen
Umut Simsekli
G. Richard
47
28
0
22 Jan 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli
Levent Sagun
Mert Gurbuzbalaban
82
247
0
18 Jan 2019
Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization
Xuefeng Gao
Mert Gurbuzbalaban
Lingjiong Zhu
51
31
0
19 Dec 2018
Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration
Xuefeng Gao
Mert Gurbuzbalaban
Lingjiong Zhu
49
60
0
12 Sep 2018
On sampling from a log-concave density using kinetic Langevin diffusions
A. Dalalyan
L. Riou-Durand
63
155
0
24 Jul 2018
Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization
Umut Simsekli
Çağatay Yıldız
T. H. Nguyen
G. Richard
A. Cemgil
36
22
0
07 Jun 2018
Stochastic Variance-Reduced Hamilton Monte Carlo Methods
Difan Zou
Pan Xu
Quanquan Gu
BDL
48
31
0
13 Feb 2018
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski
Zachary Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Yoshua Bengio
Amos Storkey
76
463
0
13 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
97
994
0
01 Nov 2017
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
Pratik Chaudhari
Stefano Soatto
MLT
65
304
0
30 Oct 2017
Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
Pan Xu
Jinghui Chen
Difan Zou
Quanquan Gu
68
205
0
20 Jul 2017
Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo
Umut Simsekli
58
45
0
12 Jun 2017
Kinetic energy choice in Hamiltonian/hybrid Monte Carlo
Samuel Livingstone
Michael F Faulkner
Gareth O. Roberts
45
45
0
08 Jun 2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Maxim Raginsky
Alexander Rakhlin
Matus Telgarsky
70
521
0
13 Feb 2017
Relativistic Monte Carlo
Xiaoyu Lu
Valerio Perrone
Leonard Hasenclever
Yee Whye Teh
Sebastian J. Vollmer
BDL
44
39
0
14 Sep 2016
Accurate and efficient numerical calculation of stable densities via optimized quadrature and asymptotics
Sebastian Ament
M. O’Neil
76
22
0
14 Jul 2016
A Variational Analysis of Stochastic Gradient Algorithms
Stephan Mandt
Matthew D. Hoffman
David M. Blei
47
161
0
08 Feb 2016
On the difficulty of training Recurrent Neural Networks
Razvan Pascanu
Tomas Mikolov
Yoshua Bengio
ODL
182
5,334
0
21 Nov 2012
MCMC using Hamiltonian dynamics
Radford M. Neal
290
3,276
0
09 Jun 2012