arXiv:1808.02941
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
8 August 2018
Xiangyi Chen
Sijia Liu
Ruoyu Sun
Mingyi Hong
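For context on the class of methods this paper analyzes: "Adam-type" algorithms update parameters with a momentum average of past gradients, scaled elementwise by a quantity built from past squared gradients (plain Adam and AMSGrad are special cases of this template). The Python sketch below is a minimal illustration using the standard Adam instance of that template; the function name, hyperparameter values, and the toy quadratic objective are assumptions chosen for illustration, not taken from the paper or this page.

import numpy as np

def adam_type_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponential moving average of gradients (momentum term).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: exponential moving average of squared gradients.
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-corrected estimates, as in standard Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Update: momentum-averaged gradient scaled by the root of the second moment.
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

# Toy usage: minimize f(x) = ||x||^2 from a random start.
x = np.random.randn(5)
m, v = np.zeros_like(x), np.zeros_like(x)
for t in range(1, 5001):
    grad = 2.0 * x
    x, m, v = adam_type_step(x, grad, m, v, t)
print(np.linalg.norm(x))  # prints a small value, near the minimizer at 0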
Papers citing "On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization"
50 / 72 papers shown
Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich
Arnulf Jentzen
Adrian Riekert
ODL
61
0
0
28 Apr 2025
A Langevin sampling algorithm inspired by the Adam optimizer
B. Leimkuhler
René Lohmann
P. Whalley
79
0
0
26 Apr 2025
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
Xianliang Li
Jun Luo
Zhiwei Zheng
Hanxiao Wang
Li Luo
Lingkun Wen
Linlong Wu
Sheng Xu
74
0
0
29 Nov 2024
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Thomas Robert
M. Safaryan
Ionut-Vlad Modoranu
Dan Alistarh
ODL
38
2
0
21 Oct 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
78
2
0
26 May 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
39
1
0
12 Apr 2024
Implicit Bias of AdamW: ℓ∞ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
55
13
0
05 Apr 2024
Conjugate-Gradient-like Based Adaptive Moment Estimation Optimization Algorithm for Deep Learning
Jiawu Tian
Liwei Xu
Xiaowei Zhang
Yongqi Li
ODL
56
0
0
02 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
42
4
0
01 Apr 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
46
2
0
07 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
51
12
0
06 Feb 2024
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
38
4
0
02 Jul 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
47
15
0
21 May 2023
Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks with Soft-Thresholding
Chunyan Xiong
Meng Lu
Xiaotong Yu
Jian-Peng Cao
Zhong Chen
D. Guo
X. Qu
MLT
43
0
0
14 Apr 2023
Stochastic Variable Metric Proximal Gradient with variance reduction for non-convex composite optimization
G. Fort
Eric Moulines
46
6
0
02 Jan 2023
Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression
Xiaoyun Li
Ping Li
FedML
39
4
0
25 Nov 2022
Fast Adaptive Federated Bilevel Optimization
Feihu Huang
FedML
29
7
0
02 Nov 2022
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
Xiang Li
Junchi Yang
Niao He
34
8
0
31 Oct 2022
Communication-Efficient Adam-Type Algorithms for Distributed Data Mining
Wenhan Xian
Feihu Huang
Heng Huang
FedML
35
0
0
14 Oct 2022
Robustness to Unbounded Smoothness of Generalized SignSGD
M. Crawshaw
Mingrui Liu
Francesco Orabona
Wei Zhang
Zhenxun Zhuang
AAML
36
66
0
23 Aug 2022
Critical Batch Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One
Hideaki Iiduka
ODL
38
4
0
21 Aug 2022
Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang
Congliang Chen
Naichen Shi
Ruoyu Sun
Zhi-Quan Luo
23
63
0
20 Aug 2022
Distributed Adversarial Training to Robustify Deep Neural Networks at Scale
Gaoyuan Zhang
Songtao Lu
Yihua Zhang
Xiangyi Chen
Pin-Yu Chen
Quanfu Fan
Lee Martie
L. Horesh
Mingyi Hong
Sijia Liu
OOD
35
12
0
13 Jun 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang
Xiang Li
Niao He
ODL
45
22
0
01 Jun 2022
Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen
Li Shen
Wei Liu
Zhi-Quan Luo
34
19
0
28 May 2022
Communication-Efficient Adaptive Federated Learning
Yujia Wang
Lu Lin
Jinghui Chen
FedML
27
71
0
05 May 2022
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
Ali Kavis
Kfir Y. Levy
V. Cevher
25
41
0
06 Apr 2022
An Adaptive Gradient Method with Energy and Momentum
Hailiang Liu
Xuping Tian
ODL
21
9
0
23 Mar 2022
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu
Conglong Li
Minjia Zhang
Christopher De Sa
Yuxiong He
OffRL
AI4CE
29
20
0
12 Feb 2022
Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang
Mingrui Liu
Ashok Cutkosky
Francesco Orabona
39
63
0
31 Jan 2022
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
26
4
0
29 Jan 2022
Communication-Efficient TeraByte-Scale Model Training Framework for Online Advertising
Weijie Zhao
Xuewu Jiao
Mingqing Hu
Xiaoyun Li
Xinming Zhang
Ping Li
3DV
40
8
0
05 Jan 2022
Minimization of Stochastic First-order Oracle Complexity of Adaptive Methods for Nonconvex Optimization
Hideaki Iiduka
15
0
0
14 Dec 2021
A Novel Convergence Analysis for Algorithms of the Adam Family
Zhishuai Guo
Yi Tian Xu
W. Yin
Rong Jin
Tianbao Yang
39
48
0
07 Dec 2021
Adaptive Differentially Private Empirical Risk Minimization
Xiaoxia Wu
Lingxiao Wang
Irina Cristali
Quanquan Gu
Rebecca Willett
40
6
0
14 Oct 2021
Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
Fu Wei
Chenglong Bao
Yang Liu
33
19
0
04 Oct 2021
On the Convergence of Decentralized Adaptive Gradient Methods
Xiangyi Chen
Belhal Karimi
Weijie Zhao
Ping Li
23
21
0
07 Sep 2021
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size
Hideaki Iiduka
26
2
0
26 Aug 2021
A New Adaptive Gradient Method with Gradient Decomposition
Zhou Shao
Tong Lin
ODL
13
0
0
18 Jul 2021
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
58
2
0
07 Jul 2021
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti
Nikhil Chopra
ODL
AI4CE
40
9
0
31 May 2021
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Dong-Young Lim
Sotirios Sabanis
39
11
0
28 May 2021
CADA: Communication-Adaptive Distributed Adam
Tianyi Chen
Ziye Guo
Yuejiao Sun
W. Yin
ODL
14
24
0
31 Dec 2020
SMG: A Shuffling Gradient-Based Method with Momentum
Trang H. Tran
Lam M. Nguyen
Quoc Tran-Dinh
23
21
0
24 Nov 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
38
17
0
25 Oct 2020
A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms
Chao Ma
Lei Wu
Weinan E
ODL
19
23
0
14 Sep 2020
Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization
Tianyi Chen
Yuejiao Sun
W. Yin
48
81
0
25 Aug 2020
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li
Francesco Orabona
92
66
0
28 Jul 2020
Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities
Alina Ene
Huy Le Nguyen
Adrian Vladu
ODL
30
28
0
17 Jul 2020
A General Family of Stochastic Proximal Gradient Methods for Deep Learning
Jihun Yun
A. Lozano
Eunho Yang
22
12
0
15 Jul 2020