ResearchTrend.AI
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv:1810.02054 · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit
  Soufiane Hayou, Arnaud Doucet, Judith Rousseau · 31 May 2019 · 106 / 4 / 0
What Can Neural Networks Reason About?
  Keyulu Xu, Jingling Li, Mozhi Zhang, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka · NAI, AI4CE · 30 May 2019 · 110 / 248 / 0
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
  Yuan Cao, Quanquan Gu · MLT, AI4CE · 30 May 2019 · 131 / 392 / 0
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
  S. Du, Kangcheng Hou, Barnabás Póczós, Ruslan Salakhutdinov, Ruosong Wang, Keyulu Xu · 30 May 2019 · 142 / 276 / 0
Generalization bounds for deep convolutional neural networks
  Philip M. Long, Hanie Sedghi · MLT · 29 May 2019 · 136 / 90 / 0
Norm-based generalisation bounds for multi-class convolutional neural networks
  Antoine Ledent, Waleed Mustafa, Yunwen Lei, Marius Kloft · 29 May 2019 · 66 / 5 / 0
On the Inductive Bias of Neural Tangent Kernels
  A. Bietti, Julien Mairal · 29 May 2019 · 128 / 260 / 0
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
  Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang · ODL · 28 May 2019 · 76 / 57 / 0
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
  Wei Hu, Zhiyuan Li, Dingli Yu · NoLa · 27 May 2019 · 113 / 12 / 0
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
  Guodong Zhang, James Martens, Roger C. Grosse · ODL · 27 May 2019 · 113 / 126 / 0
Temporal-difference learning with nonlinear function approximation: lazy training and mean field regimes
  Andrea Agazzi, Jianfeng Lu · 27 May 2019 · 98 / 8 / 0
On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective
  Lili Su, Pengkun Yang · MLT · 26 May 2019 · 80 / 54 / 0
Enhancing Adversarial Defense by k-Winners-Take-All
  Chang Xiao, Peilin Zhong, Changxi Zheng · AAML · 25 May 2019 · 80 / 99 / 0
What Can ResNet Learn Efficiently, Going Beyond Kernels?
  Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019 · 418 / 183 / 0
On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks
  Ting Yu, Junzhao Zhang, Zhanxing Zhu · MLT · 24 May 2019 · 44 / 5 / 0
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
  Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki · MLT · 23 May 2019 · 105 / 34 / 0
A type of generalization error induced by initialization in deep neural networks
  Yaoyu Zhang, Zhi-Qin John Xu, Zheng Ma · 19 May 2019 · 128 / 51 / 0
An Essay on Optimization Mystery of Deep Learning
  Eugene Golikov · ODL · 17 May 2019 · 30 / 0 / 0
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
  Colin Wei, Tengyu Ma · 09 May 2019 · 87 / 110 / 0
Rethinking Arithmetic for Deep Neural Networks
  George A. Constantinides · 07 May 2019 · 64 / 4 / 0
Linearized two-layers neural networks in high dimension
  Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 27 Apr 2019 · 97 / 243 / 0
On Exact Computation with an Infinitely Wide Neural Net
  Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019 · 294 / 928 / 0
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent
  Karthik A. Sankararaman, Soham De, Zheng Xu, Wenjie Huang, Tom Goldstein · ODL · 15 Apr 2019 · 120 / 106 / 0
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
  E. Weinan, Chao Ma, Qingcan Wang, Lei Wu · MLT · 10 Apr 2019 · 108 / 22 / 0
A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics
  E. Weinan, Chao Ma, Lei Wu · MLT · 08 Apr 2019 · 85 / 124 / 0
Correlation Congruence for Knowledge Distillation
  Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang · 03 Apr 2019 · 100 / 515 / 0
Convergence rates for the stochastic gradient descent method for non-convex objective functions
  Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen · 02 Apr 2019 · 98 / 101 / 0
On the Power and Limitations of Random Features for Understanding Neural Networks
  Gilad Yehudai, Ohad Shamir · MLT · 01 Apr 2019 · 125 / 182 / 0
On the Stability and Generalization of Learning with Kernel Activation Functions
  M. Cirillo, Simone Scardapane, S. Van Vaerenbergh, A. Uncini · 28 Mar 2019 · 20 / 0 / 0
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
  Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019 · 140 / 356 / 0
Surprises in High-Dimensional Ridgeless Least Squares Interpolation
  Trevor Hastie, Andrea Montanari, Saharon Rosset, Robert Tibshirani · 19 Mar 2019 · 302 / 747 / 0
Stabilize Deep ResNet with A Sharp Scaling Factor τ
  Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu · 17 Mar 2019 · 57 / 9 / 0
Theory III: Dynamics and Generalization in Deep Networks
  Andrzej Banburski, Q. Liao, Alycia Lee, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio · AI4CE · 12 Mar 2019 · 74 / 3 / 0
Mean Field Analysis of Deep Neural Networks
  Justin A. Sirignano, K. Spiliopoulos · 11 Mar 2019 · 109 / 82 / 0
A Priori Estimates of the Population Risk for Residual Networks
  E. Weinan, Chao Ma, Qingcan Wang · UQCV · 06 Mar 2019 · 103 / 61 / 0
Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization
  Hui Jiang · 06 Mar 2019 · 28 / 8 / 0
Implicit Regularization in Over-parameterized Neural Networks
  M. Kubo, Ryotaro Banno, Hidetaka Manabe, Masataka Minoji · 05 Mar 2019 · 88 / 23 / 0
Stabilizing the Lottery Ticket Hypothesis
  Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin · 05 Mar 2019 · 88 / 103 / 0
LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence
  Rahul Yedida, Snehanshu Saha, Tejas Prashanth · ODL · 20 Feb 2019 · 53 / 12 / 0
Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
  Xiaoxia Wu, S. Du, Rachel A. Ward · 19 Feb 2019 · 103 / 66 / 0
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
  Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 16 Feb 2019 · 90 / 280 / 0
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
  Greg Yang · 13 Feb 2019 · 209 / 289 / 0
Identity Crisis: Memorization and Generalization under Extreme Overparameterization
  Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Y. Singer · 13 Feb 2019 · 60 / 90 / 0
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
  Samet Oymak, Mahdi Soltanolkotabi · 12 Feb 2019 · 79 / 323 / 0
Understanding over-parameterized deep networks by geometrization
  Xiao Dong, Ling Zhou · GNN, AI4CE · 11 Feb 2019 · 45 / 7 / 0
Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
  Phan-Minh Nguyen · AI4CE · 07 Feb 2019 · 82 / 72 / 0
Are All Layers Created Equal?
  Chiyuan Zhang, Samy Bengio, Y. Singer · 06 Feb 2019 · 111 / 140 / 0
Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
  Yuan Cao, Quanquan Gu · ODL, MLT, AI4CE · 04 Feb 2019 · 156 / 158 / 0
Stiffness: A New Perspective on Generalization in Neural Networks
  Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan · 28 Jan 2019 · 152 / 94 / 0
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs
  D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington · 25 Jan 2019 · 86 / 42 / 0