ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2208.03941
  4. Cited By
Provable Acceleration of Nesterov's Accelerated Gradient Method over
  Heavy Ball Method in Training Over-Parameterized Neural Networks

Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks

8 August 2022
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
    ODL
ArXivPDFHTML

Papers citing "Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks"

23 / 23 papers shown
Title
Parametric estimation of stochastic differential equations via online
  gradient descent
Parametric estimation of stochastic differential equations via online gradient descent
Shogo H. Nakakita
48
2
0
17 Oct 2022
Provable Convergence of Nesterov's Accelerated Gradient Method for
  Over-Parameterized Neural Networks
Provable Convergence of Nesterov's Accelerated Gradient Method for Over-Parameterized Neural Networks
Xin Liu
Zhisong Pan
Wei Tao
129
9
0
05 Jul 2021
Revisiting the Role of Euler Numerical Integration on Acceleration and
  Stability in Convex Optimization
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization
Peiyuan Zhang
Antonio Orvieto
Hadi Daneshmand
Thomas Hofmann
Roy S. Smith
43
9
0
23 Feb 2021
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers
  Suffice Across Batch Sizes
A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes
Zachary Nado
Justin M. Gilmer
Christopher J. Shallue
Rohan Anil
George E. Dahl
ODL
60
27
0
12 Feb 2021
On the Proof of Global Convergence of Gradient Descent for Deep ReLU
  Networks with Linear Widths
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
90
48
0
24 Jan 2021
A Dynamical View on Optimization Algorithms of Overparameterized Neural
  Networks
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
54
18
0
25 Oct 2020
A Modular Analysis of Provable Acceleration via Polyak's Momentum:
  Training a Wide ReLU Network and a Deep Linear Network
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang
Chi-Heng Lin
Jacob D. Abernethy
28
23
0
04 Oct 2020
Descending through a Crowded Valley - Benchmarking Deep Learning
  Optimizers
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt
Frank Schneider
Philipp Hennig
ODL
79
166
0
03 Jul 2020
The Recurrent Neural Tangent Kernel
The Recurrent Neural Tangent Kernel
Sina Alemohammad
Zichao Wang
Randall Balestriero
Richard Baraniuk
AAML
49
78
0
18 Jun 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
749
41,932
0
28 May 2020
Why Do Deep Residual Networks Generalize Better than Deep Feedforward
  Networks? -- A Neural Tangent Kernel Perspective
Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
Kaixuan Huang
Yuqing Wang
Molei Tao
T. Zhao
MLT
51
97
0
14 Feb 2020
Enhanced Convolutional Neural Tangent Kernels
Enhanced Convolutional Neural Tangent Kernels
Zhiyuan Li
Ruosong Wang
Dingli Yu
S. Du
Wei Hu
Ruslan Salakhutdinov
Sanjeev Arora
62
132
0
03 Nov 2019
Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song
Xin Yang
60
91
0
09 Jun 2019
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph
  Kernels
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
S. Du
Kangcheng Hou
Barnabás Póczós
Ruslan Salakhutdinov
Ruosong Wang
Keyulu Xu
130
276
0
30 May 2019
On Exact Computation with an Infinitely Wide Neural Net
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
218
923
0
26 Apr 2019
Fine-Grained Analysis of Optimization and Generalization for
  Overparameterized Two-Layer Neural Networks
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
195
972
0
24 Jan 2019
Gradient Descent Finds Global Minima of Deep Neural Networks
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
192
1,135
0
09 Nov 2018
Understanding the Acceleration Phenomenon via High-Resolution
  Differential Equations
Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi
S. Du
Michael I. Jordan
Weijie J. Su
53
259
0
21 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
214
1,272
0
04 Oct 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
267
3,195
0
20 Jun 2018
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning
  Algorithms
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Han Xiao
Kashif Rasul
Roland Vollgraf
280
8,878
0
25 Aug 2017
A Differential Equation for Modeling Nesterov's Accelerated Gradient
  Method: Theory and Insights
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su
Stephen P. Boyd
Emmanuel J. Candes
160
1,166
0
04 Mar 2015
Deep Learning in Neural Networks: An Overview
Deep Learning in Neural Networks: An Overview
Jürgen Schmidhuber
HAI
243
16,354
0
30 Apr 2014
1