2003.04339
Cited By
v1
v2
v3 (latest)
Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives
9 March 2020
Zhishuai Guo
Yan Yan
Tianbao Yang
MoMe
Papers citing
"Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives"
15 / 15 papers shown
Title
On exponential convergence of SGD in non-convex over-parametrized learning
Raef Bassily
M. Belkin
Siyuan Ma
70
103
0
06 Nov 2018
A Unified Analysis of Stochastic Momentum Methods for Deep Learning
Yan Yan
Tianbao Yang
Zhe Li
Qihang Lin
Yi Yang
38
120
0
30 Aug 2018
Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions
Zaiyi Chen
Zhuoning Yuan
Jinfeng Yi
Bowen Zhou
Enhong Chen
Tianbao Yang
51
58
0
20 Aug 2018
Stochastic subgradient method converges at the rate $O(k^{-1/4})$ on weakly convex functions
Damek Davis
Dmitriy Drusvyatskiy
77
101
0
08 Feb 2018
Proximally Guided Stochastic Subgradient Method for Nonsmooth, Nonconvex Problems
Damek Davis
Benjamin Grimmer
53
113
0
12 Jul 2017
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi
J. Nutini
Mark Schmidt
280
1,220
0
16 Aug 2016
Stochastic Variance Reduction for Nonconvex Optimization
Sashank J. Reddi
Ahmed S. Hefny
S. Sra
Barnabás Póczos
Alex Smola
101
604
0
19 Mar 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,322
0
10 Dec 2015
Train faster, generalize better: Stability of stochastic gradient descent
Moritz Hardt
Benjamin Recht
Y. Singer
116
1,242
0
03 Sep 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.9K
150,260
0
22 Dec 2014
Deep learning with Elastic Averaging SGD
Sixin Zhang
A. Choromańska
Yann LeCun
FedML
96
611
0
20 Dec 2014
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Saeed Ghadimi
Guanghui Lan
ODL
122
1,555
0
22 Sep 2013
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark Schmidt
Francis R. Bach
185
260
0
10 Dec 2012
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
153
576
0
08 Dec 2012
Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization
Alexander Rakhlin
Ohad Shamir
Karthik Sridharan
169
768
0
26 Sep 2011