AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods

Zhiming Zhou, Qingru Zhang, Guansong Lu, Hongwei Wang, Weinan Zhang, Yong Yu
29 September 2018

Papers citing "AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods"

35 / 35 papers shown
On the Convergence of Adam-Type Algorithm for Bilevel Optimization under Unbounded Smoothness
Xiaochuan Gong, Jie Hao, Mingrui Liu
05 Mar 2025

A survey of synthetic data augmentation methods in computer vision
A. Mumuni, F. Mumuni, N. K. Gerrar
15 Mar 2024

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook, Yan Dai
02 Feb 2024

Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity
Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen
27 Oct 2023

An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
Zhao Song, Chiwun Yang
17 Oct 2023

A Theoretical and Empirical Study on the Convergence of Adam with an "Exact" Constant Step Size in Non-Convex Settings
Alokendu Mazumder, Rishabh Sabharwal, Manan Tayal, Bhartendu Kumar, Punit Rathore
15 Sep 2023

Convergence of Adam Under Relaxed Assumptions
Haochuan Li, Alexander Rakhlin, Ali Jadbabaie
27 Apr 2023

A Theory on Adam Instability in Large-Scale Machine Learning
Igor Molybog, Peter Albert, Moya Chen, Zach DeVito, David Esiobu, ..., Puxin Xu, Yuchen Zhang, Melanie Kambadur, Stephen Roller, Susan Zhang
19 Apr 2023 · AI4CE

FedAgg: Adaptive Federated Learning with Aggregated Gradients
Wenhao Yuan, Xuehe Wang
28 Mar 2023 · FedML

Provable Adaptivity of Adam under Non-uniform Smoothness
Bohan Wang, Yushun Zhang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhirui Ma, Tie-Yan Liu, Zhimin Luo, Wei Chen
21 Aug 2022

Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhimin Luo
20 Aug 2022

Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu, Conglong Li, Minjia Zhang, Christopher De Sa, Yuxiong He
12 Feb 2022 · OffRL, AI4CE

Momentum Centering and Asynchronous Update for Adaptive Gradient Methods
Juntang Zhuang, Yifan Ding, Tommy M. Tang, Nicha Dvornek, S. Tatikonda, James S. Duncan
11 Oct 2021 · ODL

Follow Your Path: a Progressive Method for Knowledge Distillation
Wenxian Shi, Yuxuan Song, Hao Zhou, Bohan Li, Lei Li
20 Jul 2021

A decreasing scaling transition scheme from Adam to SGD
Kun Zeng, Jinlan Liu, Zhixia Jiang, Dongpo Xu
12 Jun 2021 · ODL

Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti, Nikhil Chopra
31 May 2021 · ODL, AI4CE

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu
14 Jan 2021

Adaptive Gradient Method with Resilience and Momentum
Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang
21 Oct 2020 · ODL

GTAdam: Gradient Tracking with Adaptive Momentum for Distributed Online Optimization
Guido Carnevale, Francesco Farina, Ivano Notarnicola, G. Notarstefano
03 Sep 2020

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
03 Jul 2020 · ODL

AdaSGD: Bridging the gap between SGD and Adam
Jiaxuan Wang, Jenna Wiens
30 Jun 2020

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng, Yanghua Peng, Sheng Zha, Mu Li
24 Jun 2020 · ODL

Quantized Adam with Error Feedback
Congliang Chen, Li Shen, Haozhi Huang, Wei Liu
29 Apr 2020 · ODL, MQ

AdaX: Adaptive Gradient Descent with Exponential Long Term Memory
Wenjie Li, Zhaoyang Zhang, Xinjiang Wang, Ping Luo
21 Apr 2020 · ODL

Why are Adaptive Methods Good for Attention Models?
J.N. Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Surinder Kumar, S. Sra
06 Dec 2019

Domain-independent Dominance of Adaptive Methods
Pedro H. P. Savarese, David A. McAllester, Sudarshan Babu, Michael Maire
04 Dec 2019 · ODL

Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization
Anas Barakat, Pascal Bianchi
18 Nov 2019

Does Adam optimizer keep close to the optimal point?
Kiwook Bae, Heechang Ryu, Hayong Shin
01 Nov 2019 · ODL

An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding, Xuancheng Ren, Ruixuan Luo, Xu Sun
27 Oct 2019 · ODL

On Empirical Comparisons of Optimizers for Deep Learning
Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl
11 Oct 2019

Why gradient clipping accelerates training: A theoretical justification for adaptivity
J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie
28 May 2019

Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
Shuai Zheng, James T. Kwok
23 May 2019 · ODL

Towards Efficient and Unbiased Implementation of Lipschitz Continuity in GANs
Zhiming Zhou, Jian Shen, Yuxuan Song, Weinan Zhang, Yong Yu
02 Apr 2019

A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu
23 Nov 2018

Nostalgic Adam: Weighting more of the past gradients when designing the adaptive learning rate
Haiwen Huang, Changzhang Wang, Bin Dong
19 May 2018 · ODL