ResearchTrend.AI
SGD with memory: fundamental properties and stochastic acceleration (arXiv:2410.04228)
Dmitry Yarotsky, Maksim Velikanov
5 October 2024

Papers citing "SGD with memory: fundamental properties and stochastic acceleration"

34 papers shown
  • Corner Gradient Descent (Dmitry Yarotsky; 16 Apr 2025)
  • Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models (Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi; 17 Aug 2023)
  • A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta (Maksim Velikanov, Denis Kuznedelev, Dmitry Yarotsky; 22 Jun 2022)
  • Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions (Kiwon Lee, Andrew N. Cheng, Courtney Paquette, Elliot Paquette; 02 Jun 2022)
  • Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties (Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington; 14 May 2022)
  • More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize (Alexander Wei, Wei Hu, Jacob Steinhardt; 11 Mar 2022)
  • Accelerated SGD for Non-Strongly-Convex Least Squares (Aditya Varre, Nicolas Flammarion; 03 Mar 2022)
  • Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions (Maksim Velikanov, Dmitry Yarotsky; 02 Feb 2022)
  • Neural Networks as Kernel Learners: The Silent Alignment Effect [MLT] (Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan; 29 Oct 2021)
  • Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression (Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham Kakade; 12 Oct 2021)
  • What can linearized neural networks actually say about generalization? (Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard; 12 Jun 2021)
  • Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models [ODL] (Courtney Paquette, Elliot Paquette; 07 Jun 2021)
  • Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime (Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová; 31 May 2021)
  • Benign Overfitting of Constant-Stepsize SGD for Linear Regression (Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham Kakade; 23 Mar 2021)
  • Fast Adaptation with Linearized Neural Networks (Wesley J. Maddox, Shuai Tang, Pablo G. Moreno, A. Wilson, Andreas C. Damianou; 02 Mar 2021)
  • Approximation and Learning with Deep Convolutional Models: a Kernel Perspective (A. Bietti; 19 Feb 2021)
  • Explaining Neural Scaling Laws (Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, Utkarsh Sharma; 12 Feb 2021)
  • SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality (Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette; 08 Feb 2021)
  • Last iterate convergence of SGD for Least-Squares in the Interpolation regime (Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion; 05 Feb 2021)
  • Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel (Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli; 28 Oct 2020)
  • Finite Versus Infinite Neural Networks: an Empirical Study (Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein; 31 Jul 2020)
  • Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks (Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan; 23 Jun 2020)
  • Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model (Raphael Berthier, Francis R. Bach, Pierre Gaillard; 15 Jun 2020)
  • Frequency Bias in Neural Networks for Input of Non-Uniform Density (Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, S. Kritchman; 10 Mar 2020)
  • The large learning rate phase of deep learning: the catapult mechanism [ODL] (Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari; 04 Mar 2020)
  • A Fine-Grained Spectral Perspective on Neural Networks (Greg Yang, Hadi Salman; 24 Jul 2019)
  • Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent (Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington; 18 Feb 2019)
  • Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits [MLT] (Xialiang Dou, Tengyuan Liang; 21 Jan 2019)
  • On Lazy Training in Differentiable Programming (Lénaïc Chizat, Edouard Oyallon, Francis R. Bach; 19 Dec 2018)
  • Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Arthur Jacot, Franck Gabriel, Clément Hongler; 20 Jun 2018)
  • Stochastic Composite Least-Squares Regression with convergence rate O(1/n) (Nicolas Flammarion, Francis R. Bach; 21 Feb 2017)
  • Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression [ODL] (Aymeric Dieuleveut, Nicolas Flammarion, Francis R. Bach; 17 Feb 2016)
  • From Averaging to Acceleration, There is Only a Step-size (Nicolas Flammarion, Francis R. Bach; 07 Apr 2015)
  • Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) (Francis R. Bach, Eric Moulines; 10 Jun 2013)