Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration

Soham De, Anirbit Mukherjee, Enayat Ullah
18 July 2018 · arXiv:1807.06766
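For context, the two optimizers whose convergence the paper studies follow standard update rules. Below is a minimal NumPy sketch of one RMSProp step and one ADAM step, using the textbook formulations (Tieleman and Hinton's RMSProp; Kingma and Ba's Adam, listed among the cited papers below); the function names and hyperparameter defaults are illustrative, not the paper's experimental settings.

    import numpy as np

    def rmsprop_step(x, grad, v, lr=1e-3, beta=0.9, eps=1e-8):
        # Exponential moving average of squared gradients.
        v = beta * v + (1 - beta) * grad ** 2
        # Per-coordinate step scaled by the RMS of past gradients.
        x = x - lr * grad / (np.sqrt(v) + eps)
        return x, v

    def adam_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # First- and second-moment estimates (momentum plus RMSProp-style scaling).
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        # Bias correction compensates for zero initialization (t starts at 1).
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
        return x, m, v

A full optimizer would loop these steps over minibatch gradients; the sketch only isolates the single-step algebra that the paper's analysis and its Nesterov-acceleration comparison are about.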

Papers citing "Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration"

33 of 33 papers shown:

Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat, Anirbit Mukherjee, Procheta Sen, Mingfei Sun, Omar Rivasplata (12 Apr 2024)

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang, Yi Zhou, Shaofeng Zou (01 Apr 2024)

On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong, Junhong Lin (06 Feb 2024)

On the Convergence of Adam and Beyond
Sashank J. Reddi, Satyen Kale, Sanjiv Kumar (19 Apr 2019)

On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
Xiangyi Chen, Sijia Liu, Ruoyu Sun, Mingyi Hong (08 Aug 2018)

On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
Xiaoyun Li, Francesco Orabona (21 May 2018)

Aggregated Momentum: Stability Through Passive Damping
James Lucas, Shengyang Sun, Richard Zemel, Roger C. Grosse (01 Apr 2018)

On the insufficiency of existing momentum schemes for Stochastic Optimization
Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham Kakade (15 Mar 2018)

signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar (13 Feb 2018)

Momentum and Stochastic Momentum for Stochastic Gradient, Newton, Proximal Point and Subspace Descent Methods
Nicolas Loizou, Peter Richtárik (27 Dec 2017)

Improving Generalization Performance by Switching from Adam to SGD
Nitish Shirish Keskar, Richard Socher (20 Dec 2017)

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Chi Jin, Praneeth Netrapalli, Michael I. Jordan (28 Nov 2017)

Sparse Coding and Autoencoders
Akshay Rangamani, Anirbit Mukherjee, Amitabh Basu, Tejaswini Ganapathi, Ashish Arora, Sang Chin, Trac D. Tran (12 Aug 2017)

Training Deep AutoEncoders for Collaborative Filtering
Oleksii Kuchaiev, Boris Ginsburg (05 Aug 2017)

On the State of the Art of Evaluation in Neural Language Models
Gábor Melis, Chris Dyer, Phil Blunsom (18 Jul 2017)

Stronger Baselines for Trustable Results in Neural Machine Translation
Michael J. Denkowski, Graham Neubig (29 Jun 2017)

Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Mahesh Chandra Mukkamala, Matthias Hein (17 Jun 2017)

The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht (23 May 2017)

Stochastic Heavy Ball
Sébastien Gadat, Fabien Panloup, Sofiane Saadane (14 Sep 2016)

TensorFlow: A system for large-scale machine learning
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, ..., Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng (27 May 2016)

Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization
Tianbao Yang, Qihang Lin, Zhe Li (12 Apr 2016)

On the Influence of Momentum Acceleration on Online Learning
Kun Yuan, Bicheng Ying, Ali H. Sayed (14 Mar 2016)

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Alec Radford, Luke Metz, Soumith Chintala (19 Nov 2015)

Stop Wasting My Gradients: Practical SVRG
Reza Babanezhad, Mohamed Osama Ahmed, Alim Virani, Mark Schmidt, Jakub Konečný, Scott Sallinen (05 Nov 2015)

Skip-Thought Vectors
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler (22 Jun 2015)

Why Regularized Auto-Encoders learn Sparse Representation?
Devansh Arpit, Yingbo Zhou, Hung Ngo, Venu Govindaraju (21 May 2015)

Optimizing Neural Networks with Kronecker-factored Approximate Curvature
James Martens, Roger C. Grosse (19 Mar 2015)

DRAW: A Recurrent Neural Network For Image Generation
Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, Daan Wierstra (16 Feb 2015)

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy (11 Feb 2015)

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio (10 Feb 2015)

Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba (22 Dec 2014)

Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman (04 Sep 2014)

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
Aaron Defazio, Francis R. Bach, Simon Lacoste-Julien (01 Jul 2014)