A Sufficient Condition for Convergences of Adam and RMSProp

23 November 2018
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu

Papers citing "A Sufficient Condition for Convergences of Adam and RMSProp"

37 / 37 papers shown
Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich
Arnulf Jentzen
Adrian Riekert
ODL
61
0
0
28 Apr 2025
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
39
1
0
12 Apr 2024
Variational Stochastic Gradient Descent for Deep Neural Networks
Haotian Chen
Anna Kuzina
Babak Esmaeili
Jakub M. Tomczak
52
0
0
09 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
47
13
0
05 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
42
3
0
01 Apr 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
46
10
0
06 Feb 2024
Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup
Yan Sun
Li Shen
Hao Sun
Liang Ding
Dacheng Tao
FedML
24
17
0
30 Jul 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
36
15
0
21 May 2023
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
39
9
0
02 Feb 2023
Probabilistic Bilevel Coreset Selection
Xiao Zhou
Renjie Pi
Weizhong Zhang
Yong Lin
Tong Zhang
NoLa
28
27
0
24 Jan 2023
Model Agnostic Sample Reweighting for Out-of-Distribution Learning
Xiao Zhou
Yong Lin
Renjie Pi
Weizhong Zhang
Renzhe Xu
Peng Cui
Tong Zhang
OODD
39
60
0
24 Jan 2023
AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning
Enneng Yang
Junwei Pan
Ximei Wang
Haibin Yu
Li Shen
Xihua Chen
Lei Xiao
Jie Jiang
G. Guo
38
43
0
28 Nov 2022
System Resilience through Health Monitoring and Reconfiguration
Ion Matei
W. Piotrowski
Alexandre Perez
Johan de Kleer
J. Tierno
Wendy Mungovan
Vance Turnewitsch
31
7
0
30 Aug 2022
Critical Bach Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One
Hideaki Iiduka
ODL
38
4
0
21 Aug 2022
Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang
Congliang Chen
Naichen Shi
Ruoyu Sun
Zhimin Luo
18
62
0
20 Aug 2022
FRIB: Low-poisoning Rate Invisible Backdoor Attack based on Feature Repair
Hui Xia
Xiugui Yang
X. Qian
Rui Zhang
AAML
27
0
0
26 Jul 2022
Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review
S. Salcedo-Sanz
Jorge Pérez-Aracil
G. Ascenso
Javier Del Ser
D. Casillas-Pérez
...
D. Barriopedro
R. García-Herrera
Marcello Restelli
M. Giuliani
A. Castelletti
AI4Cl
25
13
0
03 Jun 2022
Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen
Li Shen
Wei Liu
Zhi-Quan Luo
25
19
0
28 May 2022
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
Ali Kavis
Kfir Y. Levy
V. Cevher
17
38
0
06 Apr 2022
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu
Conglong Li
Minjia Zhang
Christopher De Sa
Yuxiong He
OffRL
AI4CE
24
20
0
12 Feb 2022
On Maximum-a-Posteriori estimation with Plug & Play priors and stochastic gradient descent
R. Laumont
Valentin De Bortoli
Andrés Almansa
J. Delon
Alain Durmus
Marcelo Pereyra
25
25
0
16 Jan 2022
A Novel Convergence Analysis for Algorithms of the Adam Family
Zhishuai Guo
Yi Tian Xu
W. Yin
R. L. Jin
Tianbao Yang
39
47
0
07 Dec 2021
Classical-to-Quantum Transfer Learning for Spoken Command Recognition Based on Quantum Neural Networks
Jun Qi
Javier Tejedor
39
43
0
17 Oct 2021
Learning with Multiclass AUC: Theory and Algorithms
Zhiyong Yang
Qianqian Xu
Shilong Bao
Xiaochun Cao
Qingming Huang
36
67
0
28 Jul 2021
A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max Optimization Problems
Babak Barazandeh
Tianjian Huang
George Michailidis
24
12
0
10 Jun 2021
Bayesian imaging using Plug & Play priors: when Langevin meets Tweedie
R. Laumont
Valentin De Bortoli
Andrés Almansa
J. Delon
Alain Durmus
Marcelo Pereyra
24
109
0
08 Mar 2021
Performance Analysis of Optimizers for Plant Disease Classification with Convolutional Neural Networks
S. Labhsetwar
Soumya Haridas
Riyali Panmand
Rutuja Deshpande
Piyush Arvind Kolte
Sandhya Pati
19
4
0
08 Nov 2020
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li
Francesco Orabona
92
65
0
28 Jul 2020
Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities
Alina Ene
Huy Le Nguyen
Adrian Vladu
ODL
27
28
0
17 Jul 2020
EPI-based Oriented Relation Networks for Light Field Depth Estimation
Kunyuan Li
Jun Zhang
Rui Sun
Xudong Zhang
Jun Gao
MDE
28
22
0
09 Jul 2020
Robust Federated Recommendation System
Chen Chen
Jingfeng Zhang
A. Tung
Mohan S. Kankanhalli
Gang Chen
FedML
44
26
0
15 Jun 2020
Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions
V. Patel
18
23
0
01 Apr 2020
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu
Yura Malitsky
P. Mertikopoulos
V. Cevher
ODL
48
42
0
21 Mar 2020
Generalized Embedding Machines for Recommender Systems
Enneng Yang
Xin Xin
Li Shen
G. Guo
21
2
0
16 Feb 2020
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
14
168
0
19 Dec 2019
Demon: Improved Neural Network Training with Momentum Decay
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
24
15
0
11 Oct 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
Junzhe Zhang
Tianxing He
S. Sra
Ali Jadbabaie
30
442
0
28 May 2019