1805.08114
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
21 May 2018
Xiaoyun Li
Francesco Orabona
Papers citing "On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes"
19 / 19 papers shown
Title
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework
Siyuan Yu
Wei Chen
H. V. Poor
63
0
0
17 Jun 2024
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou
Nicolas Loizou
64
5
0
06 Jun 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
54
1
0
05 Mar 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
76
13
0
06 Feb 2024
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
52
2,482
0
19 Apr 2019
Online Adaptive Methods, Universality and Acceleration
Kfir Y. Levy
A. Yurtsever
Volkan Cevher
ODL
57
89
0
08 Sep 2018
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward
Xiaoxia Wu
Léon Bottou
ODL
50
365
0
05 Jun 2018
WNGrad: Learn the Learning Rate in Gradient Descent
Xiaoxia Wu
Rachel A. Ward
Léon Bottou
41
87
0
07 Mar 2018
Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
Ashok Cutkosky
Francesco Orabona
69
145
0
17 Feb 2018
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi
J. Nutini
Mark Schmidt
221
1,208
0
16 Aug 2016
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
173
3,198
0
15 Jun 2016
Coin Betting and Parameter-Free Online Learning
Francesco Orabona
D. Pál
93
165
0
12 Feb 2016
Scale-Free Online Learning
Francesco Orabona
D. Pál
46
103
0
08 Jan 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
808
149,474
0
22 Dec 2014
Optimization, Learning, and Games with Predictable Sequences
Alexander Rakhlin
Karthik Sridharan
54
377
0
08 Nov 2013
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Saeed Ghadimi
Guanghui Lan
ODL
71
1,538
0
22 Sep 2013
Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization
Julien Mairal
76
160
0
19 Jun 2013
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
108
6,619
0
22 Dec 2012
Optimal Distributed Online Prediction using Mini-Batches
O. Dekel
Ran Gilad-Bachrach
Ohad Shamir
Lin Xiao
241
683
0
07 Dec 2010