High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
arXiv:2204.02833
6 April 2022
Ali Kavis, Kfir Y. Levy, V. Cevher
Papers citing "High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize" (32 of 32 papers shown)
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
Huan Li, Yiming Dong, Zhouchen Lin
17 May 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization
Amit Attia, Tomer Koren
13 Mar 2025
Adaptive Extrapolated Proximal Gradient Methods with Variance Reduction for Composite Nonconvex Finite-Sum Minimization
Ganzhao Yuan
28 Feb 2025
An Energy-Based Self-Adaptive Learning Rate for Stochastic Gradient Descent: Enhancing Unconstrained Optimization with VAV method
Jiahao Zhang, Christian Moya, Guang Lin
10 Nov 2024
Large Batch Analysis for Adagrad Under Anisotropic Smoothness
Yuxing Liu, Rui Pan, Tong Zhang
21 Jun 2024
Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions
Devyani Maladkar, Ruichen Jiang, Aryan Mokhtari
07 Jun 2024
Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes
Yan Huang, Xiang Li, Yipeng Shen, Niao He, Jinming Xu
05 Jun 2024
Revisiting Convergence of AdaGrad with Relaxed Assumptions
Yusu Hong, Junhong Lin
21 Feb 2024
AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods
Tim Tsz-Kit Lau, Han Liu, Mladen Kolar [ODL]
17 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled, Chi Jin
12 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong, Junhong Lin
06 Feb 2024
How Free is Parameter-Free Stochastic Optimization?
Amit Attia, Tomer Koren [ODL]
05 Feb 2024
Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
Ron Dorfman, Naseem Yehya, Kfir Y. Levy
05 Feb 2024
High Probability Convergence of Adam Under Unbounded Gradients and Affine Variance Noise
Yusu Hong, Junhong Lin
03 Nov 2023
High Probability Analysis for Non-Convex Stochastic Optimization with Clipping
Shaojie Li, Yong Liu
25 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He, Yuqing Liang, Jinlan Liu, Dongpo Xu
20 Jul 2023
Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions
Bo Wang, Huishuai Zhang, Zhirui Ma, Wei Chen
29 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He
21 May 2023
High Probability Convergence of Stochastic Gradient Methods
Zijian Liu, Ta Duy Nguyen, Thien Hai Nguyen, Alina Ene, Huy Le Nguyen
28 Feb 2023
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
Amit Attia, Tomer Koren [ODL]
17 Feb 2023
Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD
Matthew Faw, Litu Rout, C. Caramanis, Sanjay Shakkottai
13 Feb 2023
Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization
Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, L. Dadi, V. Cevher
03 Nov 2022
Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
Kimon Antonakopoulos, Ali Kavis, V. Cevher [ODL]
03 Nov 2022
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
Xiang Li, Junchi Yang, Niao He
31 Oct 2022
Parameter-free Regret in High Probability with Heavy Tails
Jiujia Zhang, Ashok Cutkosky
25 Oct 2022
PAC-Bayesian Learning of Optimization Algorithms
Michael Sucker, Peter Ochs
20 Oct 2022
META-STORM: Generalized Fully-Adaptive Variance Reduced SGD for Unbounded Functions
Zijian Liu, Ta Duy Nguyen, Thien Hai Nguyen, Alina Ene, Huy Le Nguyen
29 Sep 2022
On the Convergence of AdaGrad(Norm) on $\mathbb{R}^d$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy Le Nguyen
29 Sep 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
Junchi Yang, Xiang Li, Niao He [ODL]
01 Jun 2022
The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded Gradients and Affine Variance
Matthew Faw, Isidoros Tziotis, C. Caramanis, Aryan Mokhtari, Sanjay Shakkottai, Rachel A. Ward
11 Feb 2022
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li, Francesco Orabona
28 Jul 2020
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu, Yura Malitsky, P. Mertikopoulos, V. Cevher [ODL]
21 Mar 2020