Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation
arXiv:2404.02378 (v2) · 3 April 2024
Aaron Mishkin, Mert Pilanci, Mark Schmidt
Papers citing "Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation" (26 papers):
Directional Smoothness and Gradient Methods: Convergence and Adaptivity
Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower (06 Mar 2024)

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio, Nicolas Loizou, I. Laradji, Ioannis Mitliagkas (28 Oct 2021)

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad (21 Oct 2021)

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip
Mathieu Even, Raphael Berthier, Francis R. Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien B. Taylor (10 Jun 2021)

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M. Belkin (29 May 2021)

Last iterate convergence of SGD for Least-Squares in the Interpolation regime
Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion (05 Feb 2021)

Improved Complexities for Stochastic Conditional Gradient Methods under Interpolation-like Conditions
Tesi Xiao, Krishnakumar Balasubramanian, Saeed Ghadimi (15 Jun 2020)

Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)
Sharan Vaswani, I. Laradji, Frederik Kunstner, S. Meng, Mark Schmidt, Simon Lacoste-Julien (11 Jun 2020)

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin (29 Feb 2020)

On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
Mahmoud Assran, Michael G. Rabbat (27 Feb 2020)

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien (24 Feb 2020)

Lower Bounds for Non-Convex Stochastic Optimization
Yossi Arjevani, Y. Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake E. Woodworth (05 Dec 2019)

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
S. Meng, Sharan Vaswani, I. Laradji, Mark Schmidt, Simon Lacoste-Julien (11 Oct 2019)

Training Neural Networks for and by Interpolation
Leonard Berrada, Andrew Zisserman, M. P. Kumar (13 Jun 2019)

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien (24 May 2019)

Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal (28 Dec 2018)

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
Samet Oymak, Mahdi Soltanolkotabi (25 Dec 2018)

On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
Aaron Defazio, Léon Bottou (11 Dec 2018)

On exponential convergence of SGD in non-convex over-parametrized learning
Xinhai Liu, M. Belkin, Yu-Shen Liu (06 Nov 2018)

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark Schmidt (16 Oct 2018)

Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity
Hilal Asi, John C. Duchi (12 Oct 2018)

Does data interpolation contradict statistical optimality?
M. Belkin, Alexander Rakhlin, Alexandre B. Tsybakov (25 Jun 2018)

On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
Sanjeev Arora, Nadav Cohen, Elad Hazan (19 Feb 2018)

The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning
Siyuan Ma, Raef Bassily, M. Belkin (18 Dec 2017)

Understanding deep learning requires rethinking generalization
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals (10 Nov 2016)

Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization
Mark Schmidt, Nicolas Le Roux, Francis R. Bach (12 Sep 2011)