ResearchTrend.AI

arXiv:2106.00092
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective


31 May 2021
Kushal Chakrabarti
Nikhil Chopra
Communities: ODL, AI4CE
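For context, the Adam iteration that this paper (and many of the citing works below) analyzes can be sketched as follows. This is a minimal illustration of the standard Adam update of Kingma & Ba, not code from the paper; the function and variable names are illustrative.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction.

    theta : current parameters
    grad  : gradient at theta
    m, v  : running first and second moment estimates
    t     : step counter, starting at 1 (needed for bias correction)
    """
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1**t)                # bias-corrected moments
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

For example, running this update on the quadratic f(theta) = theta^2 (gradient 2*theta) drives theta toward the minimizer at 0.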

Papers citing "Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective"

18 papers shown
  • AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients. Juntang Zhuang, Tommy M. Tang, Yifan Ding, S. Tatikonda, Nicha Dvornek, X. Papademetris, James S. Duncan. 15 Oct 2020. [ODL]
  • Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem. Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra. 06 Aug 2020.
  • A Simple Convergence Proof of Adam and Adagrad. Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier. 05 Mar 2020.
  • On the Convergence of Adam and Beyond. Sashank J. Reddi, Satyen Kale, Sanjiv Kumar. 19 Apr 2019.
  • Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization. Anas Barakat, Pascal Bianchi. 04 Oct 2018.
  • AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods. Zhiming Zhou, Qingru Zhang, Guansong Lu, Hongwei Wang, Weinan Zhang, Yong Yu. 29 Sep 2018.
  • On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization. Xiangyi Chen, Sijia Liu, Ruoyu Sun, Mingyi Hong. 08 Aug 2018.
  • Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration. Soham De, Anirbit Mukherjee, Enayat Ullah. 18 Jul 2018.
  • AdaGrad stepsizes: Sharp convergence over nonconvex landscapes. Rachel A. Ward, Xiaoxia Wu, Léon Bottou. 05 Jun 2018. [ODL]
  • On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes. Xiaoyun Li, Francesco Orabona. 21 May 2018.
  • WNGrad: Learn the Learning Rate in Gradient Descent. Xiaoxia Wu, Rachel A. Ward, Léon Bottou. 07 Mar 2018.
  • Deep contextualized word representations. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. 15 Feb 2018. [NAI]
  • The Marginal Value of Adaptive Gradient Methods in Machine Learning. Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht. 23 May 2017. [ODL]
  • Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Yonghui Wu, M. Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean. 26 Sep 2016. [AIMat]
  • Optimization Methods for Large-Scale Machine Learning. Léon Bottou, Frank E. Curtis, J. Nocedal. 15 Jun 2016.
  • Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Alec Radford, Luke Metz, Soumith Chintala. 19 Nov 2015. [GAN, OOD]
  • Adam: A Method for Stochastic Optimization. Diederik P. Kingma, Jimmy Ba. 22 Dec 2014. [ODL]
  • ADADELTA: An Adaptive Learning Rate Method. Matthew D. Zeiler. 22 Dec 2012. [ODL]