arXiv: 2302.01029
On Suppressing Range of Adaptive Stepsizes of Adam to Improve Generalisation Performance
2 February 2023
Guoqiang Zhang
ODL
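The common thread among the citing works below is Adam's per-coordinate adaptive stepsize, whose range the headline paper proposes to suppress. As context, a minimal NumPy sketch of one standard Adam update (illustrative only; variable names are my own, not taken from any of the listed papers):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. The effective per-coordinate stepsize is
    lr / (sqrt(v_hat) + eps), the quantity these papers analyse."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (uncentred variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction for m
    v_hat = v / (1 - beta2 ** t)              # bias correction for v
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

After bias correction, the first step moves each coordinate by roughly lr times the sign of its gradient, regardless of gradient magnitude; it is this magnitude-invariance, and the wide spread of later stepsizes, that the cited analyses focus on.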
Papers citing
"On Suppressing Range of Adaptive Stepsizes of Adam to Improve Generalisation Performance"
7 / 7 citing papers shown:

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients
Juntang Zhuang, Tommy M. Tang, Yifan Ding, S. Tatikonda, Nicha Dvornek, X. Papademetris, James S. Duncan
ODL · 15 Oct 2020

On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li
ODL · 09 Feb 2020

Why are Adaptive Methods Good for Attention Models?
J.N. Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, S. Sra
06 Dec 2019

Adaptive Gradient Methods with Dynamic Bound of Learning Rate
Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun
ODL · 26 Feb 2019

The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht
ODL · 23 May 2017

Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients
Lukas Balles, Philipp Hennig
22 May 2017

Improved Training of Wasserstein GANs
Ishaan Gulrajani, Faruk Ahmed, Martín Arjovsky, Vincent Dumoulin, Aaron Courville
GAN · 31 Mar 2017