Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity

27 October 2023
Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen
ArXiv · PDF · HTML
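
For context, the update rule whose iteration complexity this paper analyzes is the standard Adam iteration, given below as a generic textbook sketch (not taken from this page): $x_t$ are the iterates, $g_t$ the stochastic gradient, and $\eta$, $\beta_1$, $\beta_2$, $\epsilon$ the usual step-size, momentum, second-moment, and stability hyperparameters, with bias-correction factors omitted for brevity.

$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$
$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$
$x_{t+1} = x_t - \eta\, m_t / (\sqrt{v_t} + \epsilon)$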

Papers citing "Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity"

13 / 13 papers shown

Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar · 30 Dec 2024

CAdam: Confidence-Based Optimization for Online Learning
Shaowen Wang, Anan Liu, Jian Xiao, Huan Liu, Yuekui Yang, Cong Xu, Qianqian Pu, Suncong Zheng, Wei-Qiang Zhang, Jian Li · 29 Nov 2024

Understanding Adam Requires Better Rotation Dependent Assumptions
Lucas Maes, Tianyue H. Zhang, Alexia Jolicoeur-Martineau, Ioannis Mitliagkas, Damien Scieur, Simon Lacoste-Julien, Charles Guille-Escuret · 25 Oct 2024

An Attention-Based Algorithm for Gravity Adaptation Zone Calibration
Chen Yu · 06 Oct 2024

Large Batch Analysis for Adagrad Under Anisotropic Smoothness
Yuxing Liu, Rui Pan, Tong Zhang · 21 Jun 2024

BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qi Luo, Hengxu Yu, Xiao Li · 03 Apr 2024

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang, Yi Zhou, Shaofeng Zou · 01 Apr 2024

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond
Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei Chen · 22 Mar 2024

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo · 26 Feb 2024

On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong, Junhong Lin · 06 Feb 2024

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook, Yan Dai · 02 Feb 2024

Convergence of Adam Under Relaxed Assumptions
Haochuan Li, Alexander Rakhlin, Ali Jadbabaie · 27 Apr 2023

A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier · 05 Mar 2020