ResearchTrend.AI
Gradient Descent Provably Optimizes Over-parameterized Neural Networks (arXiv 1810.02054)

4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
Overparameterized Neural Networks Implement Associative Memory
  Adityanarayanan Radhakrishnan, M. Belkin, Caroline Uhler (26 Sep 2019)
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
  Ziwei Ji, Matus Telgarsky (26 Sep 2019)
Towards Understanding the Transferability of Deep Representations
  Hong Liu, Mingsheng Long, Jianmin Wang, Michael I. Jordan (26 Sep 2019)
Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently
  Rong Ge, Runzhe Wang, Haoyu Zhao (26 Sep 2019)
Wider Networks Learn Better Features
  D. Gilboa, Guy Gur-Ari (25 Sep 2019)
Asymptotics of Wide Networks from Feynman Diagrams
  Ethan Dyer, Guy Gur-Ari (25 Sep 2019)
Classification Logit Two-sample Testing by Neural Networks
  Xiuyuan Cheng, A. Cloninger (25 Sep 2019)
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
  Pan Xu, F. Gao, Quanquan Gu (18 Sep 2019)
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy
  Jiaoyang Huang, H. Yau (18 Sep 2019)
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
  Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang (29 Aug 2019)
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
  Guan-Horng Liu, Evangelos A. Theodorou (28 Aug 2019)
Stochastic AUC Maximization with Deep Neural Networks
  Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang (28 Aug 2019)
Linear Convergence of Adaptive Stochastic Gradient Descent
  Yuege Xie, Xiaoxia Wu, Rachel A. Ward (28 Aug 2019)
On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels
  Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai (27 Aug 2019)
Adaptative Inference Cost With Convolutional Neural Mixture Models
  Adria Ruiz, Jakob Verbeek (19 Aug 2019)
Effect of Activation Functions on the Training of Overparametrized Neural Nets
  A. Panigrahi, Abhishek Shetty, Navin Goyal (16 Aug 2019)
The generalization error of random features regression: Precise asymptotics and double descent curve
  Song Mei, Andrea Montanari (14 Aug 2019)
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
  Kenji Kawaguchi, Jiaoyang Huang (05 Aug 2019)
Path Length Bounds for Gradient Descent and Flow
  Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas (02 Aug 2019)
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
  Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee (24 Jul 2019)
A Fine-Grained Spectral Perspective on Neural Networks
  Greg Yang, Hadi Salman (24 Jul 2019)
Sparse Optimization on Measures with Over-parameterized Gradient Descent
  Lénaïc Chizat (24 Jul 2019)
Trainability of ReLU networks and Data-dependent Initialization
  Yeonjong Shin, George Karniadakis (23 Jul 2019)
Surfing: Iterative optimization over incrementally trained deep networks
  Ganlin Song, Z. Fan, John D. Lafferty (19 Jul 2019)
Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts
  Arthur Jacot, Franck Gabriel, François Ged, Clément Hongler (11 Jul 2019)
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
  Yuanzhi Li, Colin Wei, Tengyu Ma (10 Jul 2019)
Two-block vs. Multi-block ADMM: An empirical evaluation of convergence
  A. Gonçalves, Xiaoliang Liu, A. Banerjee (10 Jul 2019)
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model
  Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger C. Grosse (09 Jul 2019)
Scaling Limit of Neural Networks with the Xavier Initialization and Convergence to a Global Minimum
  Justin A. Sirignano, K. Spiliopoulos (09 Jul 2019)
On Symmetry and Initialization for Neural Networks
  Ido Nachum, Amir Yehudayoff (01 Jul 2019)
Benign Overfitting in Linear Regression
  Peter L. Bartlett, Philip M. Long, Gábor Lugosi, Alexander Tsigler (26 Jun 2019)
Compound Probabilistic Context-Free Grammars for Grammar Induction
  Yoon Kim, Chris Dyer, Alexander M. Rush (24 Jun 2019)
Limitations of Lazy Training of Two-layers Neural Networks
  Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari (21 Jun 2019)
Algorithmic Guarantees for Inverse Imaging with Untrained Network Priors
  Gauri Jagatap, Chinmay Hegde (20 Jun 2019)
ID3 Learns Juntas for Smoothed Product Distributions
  Alon Brutzkus, Amit Daniely, Eran Malach (20 Jun 2019)
Disentangling feature and lazy training in deep neural networks
  Mario Geiger, S. Spigler, Arthur Jacot, Matthieu Wyart (19 Jun 2019)
Convergence of Adversarial Training in Overparametrized Neural Networks
  Ruiqi Gao, Tianle Cai, Haochuan Li, Liwei Wang, Cho-Jui Hsieh, Jason D. Lee (19 Jun 2019)
Gradient Dynamics of Shallow Univariate ReLU Networks
  Francis Williams, Matthew Trager, Claudio Silva, Daniele Panozzo, Denis Zorin, Joan Bruna (18 Jun 2019)
Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup
  Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová (18 Jun 2019)
Approximation power of random neural networks
  Bolton Bailey, Ziwei Ji, Matus Telgarsky, Ruicheng Xian (18 Jun 2019)
Distributed Optimization for Over-Parameterized Learning
  Chi Zhang, Qianxiao Li (14 Jun 2019)
Kernel and Rich Regimes in Overparametrized Models
  Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, E. Moroshko, Itay Golan, Jason D. Lee, Daniel Soudry, Nathan Srebro (13 Jun 2019)
Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian
  Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi (12 Jun 2019)
Decoupling Gating from Linearity
  Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz (12 Jun 2019)
An Improved Analysis of Training Over-parameterized Deep Neural Networks
  Difan Zou, Quanquan Gu (11 Jun 2019)
Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
  Zhao Song, Xin Yang (09 Jun 2019)
One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
  Ari S. Morcos, Haonan Yu, Michela Paganini, Yuandong Tian (06 Jun 2019)
Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
  Haonan Yu, Sergey Edunov, Yuandong Tian, Ari S. Morcos (06 Jun 2019)
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies
  Ronen Basri, David Jacobs, Yoni Kasten, S. Kritchman (02 Jun 2019)
A mean-field limit for certain deep neural networks
  Dyego Araújo, R. Oliveira, Daniel Yukimura (01 Jun 2019)