ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1808.02941
  4. Cited By
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex
  Optimization

On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization

8 August 2018
Xiangyi Chen
Sijia Liu
Ruoyu Sun
Mingyi Hong
ArXivPDFHTML

Papers citing "On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization"

50 / 72 papers shown
Title
Sharp higher order convergence rates for the Adam optimizer
Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich
Arnulf Jentzen
Adrian Riekert
ODL
61
0
0
28 Apr 2025
A Langevin sampling algorithm inspired by the Adam optimizer
A Langevin sampling algorithm inspired by the Adam optimizer
B. Leimkuhler
René Lohmann
P. Whalley
79
0
0
26 Apr 2025
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
Xianliang Li
Jun Luo
Zhiwei Zheng
Hanxiao Wang
Li Luo
Lingkun Wen
Linlong Wu
Sheng Xu
74
0
0
29 Nov 2024
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Thomas Robert
M. Safaryan
Ionut-Vlad Modoranu
Dan Alistarh
ODL
38
2
0
21 Oct 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
78
2
0
26 May 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
39
1
0
12 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Implicit Bias of AdamW: ℓ∞\ell_\inftyℓ∞​ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
55
13
0
05 Apr 2024
Conjugate-Gradient-like Based Adaptive Moment Estimation Optimization Algorithm for Deep Learning
Conjugate-Gradient-like Based Adaptive Moment Estimation Optimization Algorithm for Deep Learning
Jiawu Tian
Liwei Xu
Xiaowei Zhang
Yongqi Li
ODL
56
0
0
02 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
42
4
0
01 Apr 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
46
2
0
07 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
51
12
0
06 Feb 2024
Bidirectional Looking with A Novel Double Exponential Moving Average to
  Adaptive and Non-adaptive Momentum Optimizers
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
38
4
0
02 Jul 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of
  Adaptive Methods
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
47
15
0
21 May 2023
Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks
  with Soft-Thresholding
Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks with Soft-Thresholding
Chunyan Xiong
Meng Lu
Xiaotong Yu
JIAN-PENG Cao
Zhong Chen
D. Guo
X. Qu
MLT
43
0
0
14 Apr 2023
Stochastic Variable Metric Proximal Gradient with variance reduction for
  non-convex composite optimization
Stochastic Variable Metric Proximal Gradient with variance reduction for non-convex composite optimization
G. Fort
Eric Moulines
46
6
0
02 Jan 2023
Analysis of Error Feedback in Federated Non-Convex Optimization with
  Biased Compression
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression
Xiaoyun Li
Ping Li
FedML
39
4
0
25 Nov 2022
Fast Adaptive Federated Bilevel Optimization
Fast Adaptive Federated Bilevel Optimization
Feihu Huang
FedML
29
7
0
02 Nov 2022
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax
  Optimization
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
Xiang Li
Junchi Yang
Niao He
34
8
0
31 Oct 2022
Communication-Efficient Adam-Type Algorithms for Distributed Data Mining
Communication-Efficient Adam-Type Algorithms for Distributed Data Mining
Wenhan Xian
Feihu Huang
Heng-Chiao Huang
FedML
35
0
0
14 Oct 2022
Robustness to Unbounded Smoothness of Generalized SignSGD
Robustness to Unbounded Smoothness of Generalized SignSGD
M. Crawshaw
Mingrui Liu
Francesco Orabona
Wei Zhang
Zhenxun Zhuang
AAML
36
66
0
23 Aug 2022
Critical Bach Size Minimizes Stochastic First-Order Oracle Complexity of
  Deep Learning Optimizer using Hyperparameters Close to One
Critical Bach Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One
Hideaki Iiduka
ODL
38
4
0
21 Aug 2022
Adam Can Converge Without Any Modification On Update Rules
Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang
Congliang Chen
Naichen Shi
Ruoyu Sun
Zhimin Luo
23
63
0
20 Aug 2022
Distributed Adversarial Training to Robustify Deep Neural Networks at
  Scale
Distributed Adversarial Training to Robustify Deep Neural Networks at Scale
Gaoyuan Zhang
Songtao Lu
Yihua Zhang
Xiangyi Chen
Pin-Yu Chen
Quanfu Fan
Lee Martie
L. Horesh
Min-Fong Hong
Sijia Liu
OOD
35
12
0
13 Jun 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax
  Optimization
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang
Xiang Li
Niao He
ODL
45
22
0
01 Jun 2022
Efficient-Adam: Communication-Efficient Distributed Adam
Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen
Li Shen
Wei Liu
Zhi-Quan Luo
34
19
0
28 May 2022
Communication-Efficient Adaptive Federated Learning
Communication-Efficient Adaptive Federated Learning
Yujia Wang
Lu Lin
Jinghui Chen
FedML
27
71
0
05 May 2022
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad
  Stepsize
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
Ali Kavis
Kfir Y. Levy
V. Cevher
25
41
0
06 Apr 2022
An Adaptive Gradient Method with Energy and Momentum
An Adaptive Gradient Method with Energy and Momentum
Hailiang Liu
Xuping Tian
ODL
21
9
0
23 Mar 2022
Maximizing Communication Efficiency for Large-scale Training via 0/1
  Adam
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu
Conglong Li
Minjia Zhang
Christopher De Sa
Yuxiong He
OffRL
AI4CE
29
20
0
12 Feb 2022
Understanding AdamW through Proximal Methods and Scale-Freeness
Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang
Mingrui Liu
Ashok Cutkosky
Francesco Orabona
39
63
0
31 Jan 2022
A Stochastic Bundle Method for Interpolating Networks
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
26
4
0
29 Jan 2022
Communication-Efficient TeraByte-Scale Model Training Framework for
  Online Advertising
Communication-Efficient TeraByte-Scale Model Training Framework for Online Advertising
Weijie Zhao
Xuewu Jiao
Mingqing Hu
Xiaoyun Li
Xinming Zhang
Ping Li
3DV
40
8
0
05 Jan 2022
Minimization of Stochastic First-order Oracle Complexity of Adaptive
  Methods for Nonconvex Optimization
Minimization of Stochastic First-order Oracle Complexity of Adaptive Methods for Nonconvex Optimization
Hideaki Iiduka
15
0
0
14 Dec 2021
A Novel Convergence Analysis for Algorithms of the Adam Family
A Novel Convergence Analysis for Algorithms of the Adam Family
Zhishuai Guo
Yi Tian Xu
W. Yin
Rong Jin
Tianbao Yang
39
48
0
07 Dec 2021
Adaptive Differentially Private Empirical Risk Minimization
Adaptive Differentially Private Empirical Risk Minimization
Xiaoxia Wu
Lingxiao Wang
Irina Cristali
Quanquan Gu
Rebecca Willett
40
6
0
14 Oct 2021
Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
Fu Wei
Chenglong Bao
Yang Liu
33
19
0
04 Oct 2021
On the Convergence of Decentralized Adaptive Gradient Methods
On the Convergence of Decentralized Adaptive Gradient Methods
Xiangyi Chen
Belhal Karimi
Weijie Zhao
Ping Li
23
21
0
07 Sep 2021
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning
  Optimizer is a Rational Function of Batch Size
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size
Hideaki Iiduka
26
2
0
26 Aug 2021
A New Adaptive Gradient Method with Gradient Decomposition
A New Adaptive Gradient Method with Gradient Decomposition
Zhou Shao
Tong Lin
ODL
13
0
0
18 Jul 2021
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
58
2
0
07 Jul 2021
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti
Nikhil Chopra
ODL
AI4CE
40
9
0
31 May 2021
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient
  adaptive algorithms for neural networks
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Dong-Young Lim
Sotirios Sabanis
39
11
0
28 May 2021
CADA: Communication-Adaptive Distributed Adam
CADA: Communication-Adaptive Distributed Adam
Tianyi Chen
Ziye Guo
Yuejiao Sun
W. Yin
ODL
14
24
0
31 Dec 2020
SMG: A Shuffling Gradient-Based Method with Momentum
SMG: A Shuffling Gradient-Based Method with Momentum
Trang H. Tran
Lam M. Nguyen
Quoc Tran-Dinh
23
21
0
24 Nov 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural
  Networks
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
38
17
0
25 Oct 2020
A Qualitative Study of the Dynamic Behavior for Adaptive Gradient
  Algorithms
A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms
Chao Ma
Lei Wu
E. Weinan
ODL
19
23
0
14 Sep 2020
Solving Stochastic Compositional Optimization is Nearly as Easy as
  Solving Stochastic Optimization
Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization
Tianyi Chen
Yuejiao Sun
W. Yin
48
81
0
25 Aug 2020
A High Probability Analysis of Adaptive SGD with Momentum
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li
Francesco Orabona
92
66
0
28 Jul 2020
Adaptive Gradient Methods for Constrained Convex Optimization and
  Variational Inequalities
Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities
Alina Ene
Huy Le Nguyen
Adrian Vladu
ODL
30
28
0
17 Jul 2020
A General Family of Stochastic Proximal Gradient Methods for Deep
  Learning
A General Family of Stochastic Proximal Gradient Methods for Deep Learning
Jihun Yun
A. Lozano
Eunho Yang
22
12
0
15 Jul 2020
12
Next