Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation
arXiv:2404.02378 (v2) · 3 April 2024
Aaron Mishkin, Mert Pilanci, Mark Schmidt
Papers citing "Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation" (26 papers):
Directional Smoothness and Gradient Methods: Convergence and Adaptivity
Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower (06 Mar 2024)

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio, Nicolas Loizou, I. Laradji, Ioannis Mitliagkas (28 Oct 2021)

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad (21 Oct 2021)

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip
Mathieu Even, Raphael Berthier, Francis R. Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien B. Taylor (10 Jun 2021)

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M. Belkin (29 May 2021)

Last iterate convergence of SGD for Least-Squares in the Interpolation regime
Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion (05 Feb 2021)

Improved Complexities for Stochastic Conditional Gradient Methods under Interpolation-like Conditions
Tesi Xiao, Krishnakumar Balasubramanian, Saeed Ghadimi (15 Jun 2020)

Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)
Sharan Vaswani, I. Laradji, Frederik Kunstner, S. Meng, Mark Schmidt, Simon Lacoste-Julien (11 Jun 2020)

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin (29 Feb 2020)

On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
Mahmoud Assran, Michael G. Rabbat (27 Feb 2020)

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien (24 Feb 2020)

Lower Bounds for Non-Convex Stochastic Optimization
Yossi Arjevani, Y. Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake E. Woodworth (05 Dec 2019)

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
S. Meng, Sharan Vaswani, I. Laradji, Mark Schmidt, Simon Lacoste-Julien (11 Oct 2019)

Training Neural Networks for and by Interpolation
Leonard Berrada, Andrew Zisserman, M. P. Kumar (13 Jun 2019)

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien (24 May 2019)

Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal (28 Dec 2018)

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
Samet Oymak, Mahdi Soltanolkotabi (25 Dec 2018)

On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
Aaron Defazio, Léon Bottou (11 Dec 2018)

On exponential convergence of SGD in non-convex over-parametrized learning
Xinhai Liu, M. Belkin, Yu-Shen Liu (06 Nov 2018)

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark Schmidt (16 Oct 2018)

Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity
Hilal Asi, John C. Duchi (12 Oct 2018)

Does data interpolation contradict statistical optimality?
M. Belkin, Alexander Rakhlin, Alexandre B. Tsybakov (25 Jun 2018)

On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
Sanjeev Arora, Nadav Cohen, Elad Hazan (19 Feb 2018)

The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning
Siyuan Ma, Raef Bassily, M. Belkin (18 Dec 2017)

Understanding deep learning requires rethinking generalization
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (10 Nov 2016)

Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization
Mark Schmidt, Nicolas Le Roux, Francis R. Bach (12 Sep 2011)