1807.06766
Cited By
Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
18 July 2018
Soham De
Anirbit Mukherjee
Enayat Ullah
Papers citing "Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration"
33 / 33 papers shown
Title
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
66
1
0
12 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
99
7
0
01 Apr 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
101
13
0
06 Feb 2024
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
93
2,499
0
19 Apr 2019
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
Xiangyi Chen
Sijia Liu
Ruoyu Sun
Mingyi Hong
55
323
0
08 Aug 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
Xiaoyun Li
Francesco Orabona
69
295
0
21 May 2018
Aggregated Momentum: Stability Through Passive Damping
James Lucas
Shengyang Sun
R. Zemel
Roger C. Grosse
59
68
0
01 Apr 2018
On the insufficiency of existing momentum schemes for Stochastic Optimization
Rahul Kidambi
Praneeth Netrapalli
Prateek Jain
Sham Kakade
ODL
73
119
0
15 Mar 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
ODL
96
1,043
0
13 Feb 2018
Momentum and Stochastic Momentum for Stochastic Gradient, Newton, Proximal Point and Subspace Descent Methods
Nicolas Loizou
Peter Richtárik
73
201
0
27 Dec 2017
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
96
523
0
20 Dec 2017
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Chi Jin
Praneeth Netrapalli
Michael I. Jordan
ODL
64
262
0
28 Nov 2017
Sparse Coding and Autoencoders
Akshay Rangamani
Anirbit Mukherjee
A. Basu
T. Ganapathi
Ashish Arora
S. Chin
T. Tran
65
20
0
12 Aug 2017
Training Deep AutoEncoders for Collaborative Filtering
Oleksii Kuchaiev
Boris Ginsburg
65
80
0
05 Aug 2017
On the State of the Art of Evaluation in Neural Language Models
Gábor Melis
Chris Dyer
Phil Blunsom
65
535
0
18 Jul 2017
Stronger Baselines for Trustable Results in Neural Machine Translation
Michael J. Denkowski
Graham Neubig
57
118
0
29 Jun 2017
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Mahesh Chandra Mukkamala
Matthias Hein
ODL
54
258
0
17 Jun 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
65
1,030
0
23 May 2017
Stochastic Heavy Ball
S. Gadat
Fabien Panloup
Sofiane Saadane
93
103
0
14 Sep 2016
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhifeng Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
GNN
AI4CE
433
18,350
0
27 May 2016
Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization
Tianbao Yang
Qihang Lin
Zhe Li
61
122
0
12 Apr 2016
On the Influence of Momentum Acceleration on Online Learning
Kun Yuan
Bicheng Ying
Ali H. Sayed
62
58
0
14 Mar 2016
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Alec Radford
Luke Metz
Soumith Chintala
GAN
OOD
253
14,008
0
19 Nov 2015
Stop Wasting My Gradients: Practical SVRG
Reza Babanezhad
Mohamed Osama Ahmed
Alim Virani
Mark Schmidt
Jakub Konečný
Scott Sallinen
62
134
0
05 Nov 2015
Skip-Thought Vectors
Ryan Kiros
Yukun Zhu
Ruslan Salakhutdinov
R. Zemel
Antonio Torralba
R. Urtasun
Sanja Fidler
SSL
214
2,411
0
22 Jun 2015
Why Regularized Auto-Encoders learn Sparse Representation?
Devansh Arpit
Yingbo Zhou
H. Ngo
V. Govindaraju
62
68
0
21 May 2015
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
James Martens
Roger C. Grosse
ODL
101
1,013
0
19 Mar 2015
DRAW: A Recurrent Neural Network For Image Generation
Karol Gregor
Ivo Danihelka
Alex Graves
Danilo Jimenez Rezende
Daan Wierstra
GAN
DRL
163
1,962
0
16 Feb 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,289
0
11 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
346
10,069
0
10 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,039
0
22 Dec 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,348
0
04 Sep 2014
SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
Aaron Defazio
Francis R. Bach
Simon Lacoste-Julien
ODL
133
1,822
0
01 Jul 2014