Gradient Descent Finds Global Minima of Deep Neural Networks

arXiv:1811.03804 · 9 November 2018
S. Du, J. Lee, Haochuan Li, Liwei Wang, Xiyu Zhai
ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"

Showing 50 of 763 citing papers

Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
Guodong Zhang, James Martens, Roger C. Grosse · ODL · 27 May 2019

Temporal-difference learning with nonlinear function approximation: lazy training and mean field regimes
Andrea Agazzi, Jianfeng Lu · 27 May 2019

On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective
Lili Su, Pengkun Yang · MLT · 26 May 2019

What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019

On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks
Ting Yu, Junzhao Zhang, Zhanxing Zhu · MLT · 24 May 2019

How degenerate is the parametrization of neural networks with the ReLU activation function?
Julius Berner, Dennis Elbrächter, Philipp Grohs · ODL · 23 May 2019

Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces
Yanwei Fu, Chen Liu, Donghao Li, Zuyuan Zhong, Xinwei Sun, Jinshan Zeng, Yuan Yao · 23 May 2019

Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
Mengtian Li, Ersin Yumer, Deva Ramanan · 12 May 2019

Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
Colin Wei, Tengyu Ma · 09 May 2019

Linearized two-layers neural networks in high dimension
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 27 Apr 2019

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019

A Selective Overview of Deep Learning
Jianqing Fan, Cong Ma, Yiqiao Zhong · BDL, VLM · 10 Apr 2019

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E, Chao Ma, Qingcan Wang, Lei Wu · MLT · 10 Apr 2019

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics
Weinan E, Chao Ma, Lei Wu · MLT · 08 Apr 2019

Stokes Inversion based on Convolutional Neural Networks
A. Ramos (Institute for Solar Physics) · 07 Apr 2019

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling · AAML · 07 Apr 2019

Convergence rates for the stochastic gradient descent method for non-convex objective functions
Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen · 02 Apr 2019

On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai, Ohad Shamir · MLT · 01 Apr 2019

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019

Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Trevor Hastie, Andrea Montanari, Saharon Rosset, R. Tibshirani · 19 Mar 2019

Stabilize Deep ResNet with A Sharp Scaling Factor τ
Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu · 17 Mar 2019

Theory III: Dynamics and Generalization in Deep Networks
Andrzej Banburski, Q. Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio · AI4CE · 12 Mar 2019

Mean Field Analysis of Deep Neural Networks
Justin A. Sirignano, K. Spiliopoulos · 11 Mar 2019

A Priori Estimates of the Population Risk for Residual Networks
Weinan E, Chao Ma, Qingcan Wang · UQCV · 06 Mar 2019

Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization
Hui Jiang · 06 Mar 2019

Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
Maher Nouiehed, Maziar Sanjabi, Tianjian Huang, J. Lee, Meisam Razaviyayn · 21 Feb 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward · 19 Feb 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington · 18 Feb 2019

Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 16 Feb 2019

Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Greg Yang · 13 Feb 2019

Identity Crisis: Memorization and Generalization under Extreme Overparameterization
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Y. Singer · 13 Feb 2019

Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak, Mahdi Soltanolkotabi · 12 Feb 2019

Combining learning rate decay and weight decay with complexity gradient descent - Part I
Pierre Harvey Richemond, Yike Guo · 07 Feb 2019

Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
Phan-Minh Nguyen · AI4CE · 07 Feb 2019

Are All Layers Created Equal?
Chiyuan Zhang, Samy Bengio, Y. Singer · 06 Feb 2019

Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Yuan Cao, Quanquan Gu · ODL, MLT, AI4CE · 04 Feb 2019

On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
Jian Li, Xuanyuan Luo, Mingda Qiao · 02 Feb 2019

Depth creates no more spurious local minima
Li Zhang · 28 Jan 2019

Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan · 28 Jan 2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · MLT · 24 Jan 2019

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019

On Connected Sublevel Sets in Deep Learning
Quynh N. Nguyen · 22 Jan 2019

Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, L. Kaelbling · 02 Jan 2019

A Theoretical Analysis of Deep Q-Learning
Jianqing Fan, Zhuoran Yang, Yuchen Xie, Zhaoran Wang · 01 Jan 2019

On the Benefit of Width for Neural Networks: Disappearance of Bad Basins
Dawei Li, Tian Ding, Ruoyu Sun · 28 Dec 2018

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
Samet Oymak, Mahdi Soltanolkotabi · ODL · 25 Dec 2018

On Lazy Training in Differentiable Programming
Lénaïc Chizat, Edouard Oyallon, Francis R. Bach · 19 Dec 2018

A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks
Jinghui Chen, Dongruo Zhou, Jinfeng Yi, Quanquan Gu · AAML · 27 Nov 2018

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu · ODL · 21 Nov 2018

A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · AI4CE, ODL · 09 Nov 2018