Improved Analysis of Clipping Algorithms for Non-convex Optimization

5 October 2020
Bohang Zhang, Jikai Jin, Cong Fang, Liwei Wang

Papers citing "Improved Analysis of Clipping Algorithms for Non-convex Optimization"

12 / 12 papers shown

AdamS: Momentum Itself Can Be A Normalizer for LLM Pretraining and Post-training
Huishuai Zhang, Bohan Wang, Luoxin Chen · ODL · 0 citations · 22 May 2025

Extended convexity and smoothness and their applications in deep learning
Binchuan Qi, Wei Gong, Li Li · 0 citations · 08 Oct 2024

An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Xiaochuan Gong, Jie Hao, Mingrui Liu · 2 citations · 28 Sep 2024

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang, Yi Zhou, Shaofeng Zou · 5 citations · 01 Apr 2024

Directional Smoothness and Gradient Methods: Convergence and Adaptivity
Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower · 7 citations · 06 Mar 2024

On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong, Junhong Lin · 13 citations · 06 Feb 2024

Regularized Q-Learning with Linear Function Approximation
Jiachen Xi, Alfredo Garcia, P. Momcilovic · 2 citations · 26 Jan 2024

Why are Adaptive Methods Good for Attention Models?
Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra · 80 citations · 06 Dec 2019

Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma, Denis Yarats · ODL · 129 citations · 16 Oct 2018

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Cong Fang, C. J. Li, Zhouchen Lin, Tong Zhang · 572 citations · 04 Jul 2018

Snapshot Ensembles: Train 1, get M for free
Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, Kilian Q. Weinberger · OOD, FedML, UQCV · 938 citations · 01 Apr 2017

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Saeed Ghadimi, Guanghui Lan · ODL · 1,538 citations · 22 Sep 2013