arXiv:1806.01811

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward, Xiaoxia Wu, Léon Bottou
5 June 2018
Community: ODL
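For context on the paper being cited: it analyzes AdaGrad-Norm, a scalar variant of AdaGrad in which a single stepsize is divided by the square root of the accumulated squared gradient norms. Below is a minimal sketch of that update rule, assuming a generic stochastic-gradient oracle grad_fn; the function name and default hyperparameters are illustrative, not taken from the paper.

```python
import numpy as np

def adagrad_norm(grad_fn, x0, eta=1.0, b0=0.1, steps=1000):
    """Sketch of the AdaGrad-Norm update: a single stepsize eta / b_t,
    where b_t^2 accumulates the squared norms of stochastic gradients."""
    x = np.asarray(x0, dtype=float).copy()
    b2 = b0 ** 2                       # running accumulator b_t^2, seeded with b_0 > 0
    for _ in range(steps):
        g = grad_fn(x)                 # stochastic gradient at the current iterate
        b2 += float(np.dot(g, g))      # b_{t+1}^2 = b_t^2 + ||g_t||^2
        x -= (eta / np.sqrt(b2)) * g   # x_{t+1} = x_t - (eta / b_{t+1}) g_t
    return x
```

For example, adagrad_norm(lambda x: 2 * x + np.random.randn(*x.shape), np.ones(10)) runs the scheme on a noisy quadratic. Roughly, the paper's headline result is that this scheme converges to a stationary point at essentially the rate of well-tuned SGD, without prior knowledge of the smoothness constant or noise level; eta and b0 affect only constants.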
Cited By
Papers citing "AdaGrad stepsizes: Sharp convergence over nonconvex landscapes" (50 of 53 papers shown):
- Observability conditions for neural state-space models with eigenvalues and their roots of unity
  Andrew Gracyk (22 Apr 2025)
- Adaptive Extrapolated Proximal Gradient Methods with Variance Reduction for Composite Nonconvex Finite-Sum Minimization
  Ganzhao Yuan (28 Feb 2025)
- Sparklen: A Statistical Learning Toolkit for High-Dimensional Hawkes Processes in Python
  Romain Edmond Lacoste (26 Feb 2025) [GP]
- An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
  Antonio Orvieto, Lin Xiao (05 Jul 2024)
- Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
  Dimitris Oikonomou, Nicolas Loizou (06 Jun 2024)
- Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
  Matteo Tucat, Anirbit Mukherjee, Procheta Sen, Mingfei Sun, Omar Rivasplata (12 Apr 2024) [MLT]
- Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
  Qi Zhang, Yi Zhou, Shaofeng Zou (01 Apr 2024)
- Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
  Sayantan Choudhury, N. Tupitsa, Nicolas Loizou, Samuel Horváth, Martin Takáč, Eduard A. Gorbunov (05 Mar 2024)
- AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
  P. Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov (07 Feb 2024) [ODL]
- On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
  Yusu Hong, Junhong Lin (06 Feb 2024)
- How Free is Parameter-Free Stochastic Optimization?
  Amit Attia, Tomer Koren (05 Feb 2024) [ODL]
- Bridging Classical and Quantum Machine Learning: Knowledge Transfer From Classical to Quantum Neural Networks Using Knowledge Distillation
  Mohammad Junayed Hasan, M.R.C. Mahdy (23 Nov 2023)
- Demystifying the Myths and Legends of Nonconvex Convergence of SGD
  Aritra Dutta, El Houcine Bergou, Soumia Boucherouite, Nicklas Werge, M. Kandemir, Xin Li (19 Oct 2023)
- Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions
  Bo Wang, Huishuai Zhang, Zhirui Ma, Wei Chen (29 May 2023)
- Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
  Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He (21 May 2023)
- Adaptive Federated Learning via New Entropy Approach
  Shensheng Zheng, Wenhao Yuan, Xuehe Wang, Ling-Yu Duan (27 Mar 2023) [FedML, OOD]
- SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
  Amit Attia, Tomer Koren (17 Feb 2023) [ODL]
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
  Maor Ivgi, Oliver Hinder, Y. Carmon (08 Feb 2023) [ODL]
- Learning-Rate-Free Learning by D-Adaptation
  Aaron Defazio, Konstantin Mishchenko (18 Jan 2023)
- Differentially Private Adaptive Optimization with Delayed Preconditioners
  Tian Li, Manzil Zaheer, Ziyu Liu, Sashank J. Reddi, H. B. McMahan, Virginia Smith (01 Dec 2022)
- Adaptive Stochastic Optimisation of Nonconvex Composite Objectives
  Weijia Shao, F. Sivrikaya, S. Albayrak (21 Nov 2022)
- Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
  Kimon Antonakopoulos, Ali Kavis, V. Cevher (03 Nov 2022) [ODL]
- TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
  Xiang Li, Junchi Yang, Niao He (31 Oct 2022)
- On the fast convergence of minibatch heavy ball momentum
  Raghu Bollapragada, Tyler Chen, Rachel A. Ward (15 Jun 2022)
- Supervised Dictionary Learning with Auxiliary Covariates
  Joo-Hyun Lee, Hanbaek Lyu, W. Yao (14 Jun 2022)
- Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
  Junchi Yang, Xiang Li, Niao He (01 Jun 2022) [ODL]
- Efficient-Adam: Communication-Efficient Distributed Adam
  Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo (28 May 2022)
- High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
  Ali Kavis, Kfir Y. Levy, V. Cevher (06 Apr 2022)
- Adaptive Gradient Methods with Local Guarantees
  Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan (02 Mar 2022) [ODL]
- A Novel Convergence Analysis for Algorithms of the Adam Family
  Zhishuai Guo, Yi Tian Xu, W. Yin, R. L. Jin, Tianbao Yang (07 Dec 2021)
- Adaptive Differentially Private Empirical Risk Minimization
  Xiaoxia Wu, Lingxiao Wang, Irina Cristali, Quanquan Gu, Rebecca Willett (14 Oct 2021)
- On the Convergence of Decentralized Adaptive Gradient Methods
  Xiangyi Chen, Belhal Karimi, Weijie Zhao, Ping Li (07 Sep 2021)
- A Decentralized Federated Learning Framework via Committee Mechanism with Convergence Guarantee
  Chunjiang Che, Xiaoli Li, Chuan Chen, Xiaoyu He, Zibin Zheng (01 Aug 2021) [FedML]
- A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max Optimization Problems
  Babak Barazandeh, Tianjian Huang, George Michailidis (10 Jun 2021)
- Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
  Kushal Chakrabarti, Nikhil Chopra (31 May 2021) [ODL, AI4CE]
- A Knowledge Graph-Enhanced Tensor Factorisation Model for Discovering Drug Targets
  Cheng Ye, Rowan Swiers, Stephen Bonner, Ian P Barrett (20 May 2021)
- Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis
  Stephan Wojtowytsch (04 May 2021)
- Variance Reduced Training with Stratified Sampling for Forecasting Models
  Yucheng Lu, Youngsuk Park, Lifan Chen, Bernie Wang, Christopher De Sa, Dean Phillips Foster (02 Mar 2021) [AI4TS]
- Scalable Balanced Training of Conditional Generative Adversarial Neural Networks on Image Data
  Massimiliano Lupo Pasini, Vittorio Gabbi, Junqi Yin, S. Perotto, N. Laanait (21 Feb 2021) [GAN, AI4CE]
- Block majorization-minimization with diminishing radius for constrained nonconvex optimization
  Hanbaek Lyu, Yuchen Li (07 Dec 2020)
- Sequential convergence of AdaGrad algorithm for smooth convex optimization
  Cheik Traoré, Edouard Pauwels (24 Nov 2020)
- Learning explanations that are hard to vary
  Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf (01 Sep 2020) [FAtt]
- Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities
  Alina Ene, Huy Le Nguyen, Adrian Vladu (17 Jul 2020) [ODL]
- Incremental Without Replacement Sampling in Nonconvex Optimization
  Edouard Pauwels (15 Jul 2020)
- SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
  Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou (18 Jun 2020)
- Optimal Complexity in Decentralized Training
  Yucheng Lu, Christopher De Sa (15 Jun 2020)
- Non-convergence of stochastic gradient descent in the training of deep neural networks
  Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek (12 Jun 2020)
- Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
  Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien (24 Feb 2020)
- Low Rank Saddle Free Newton: A Scalable Method for Stochastic Nonconvex Optimization
  Thomas O'Leary-Roseberry, Nick Alger, Omar Ghattas (07 Feb 2020) [ODL]
- Why gradient clipping accelerates training: A theoretical justification for adaptivity
  Jingzhao Zhang, Tianxing He, S. Sra, Ali Jadbabaie (28 May 2019)