ResearchTrend.AI

Gradient Descent Finds Global Minima of Deep Neural Networks
arXiv: 1811.03804 (v4, latest)
9 November 2018
S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka
ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"

50 / 466 papers shown
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
09 Nov 2020

Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher, David Reeb, Ingo Steinwart
ODL
04 Nov 2020

DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks
Shiyun Xu, Zhiqi Bu
01 Nov 2020

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli
28 Oct 2020

Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
Sina Alemohammad, Hossein Babaei, Randall Balestriero, Matt Y. Cheung, Ahmed Imtiaz Humayun, ..., Naiming Liu, Lorenzo Luzi, Jasper Tan, Zichao Wang, Richard G. Baraniuk
27 Oct 2020

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu, Shiyun Xu, Kan Chen
25 Oct 2020

Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi, Jianfeng Lu
22 Oct 2020

Beyond Lazy Training for Over-parameterized Tensor Decomposition
Xiang Wang, Chenwei Wu, Jason D. Lee, Tengyu Ma, Rong Ge
22 Oct 2020

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020

Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities
A. Nassar, Y. Yilmaz
AI4CE
19 Oct 2020

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Guosheng Lin, E. Weinan
12 Oct 2020

Constraining Logits by Bounded Function for Adversarial Robustness
Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida
AAML
06 Oct 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun, Shankar Krishnan, H. Mobahi
MLT
06 Oct 2020

WeMix: How to Better Utilize Data Augmentation
Yi Tian Xu, Asaf Noy, Ming Lin, Qi Qian, Hao Li, Rong Jin
03 Oct 2020

On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu, Libin Zhu, M. Belkin
02 Oct 2020

Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti, Francis R. Bach
30 Sep 2020

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
MLT
24 Sep 2020

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su, Yihang Chen, Tianle Cai, Tianhao Wu, Ruiqi Gao, Liwei Wang, Jason D. Lee
22 Sep 2020

Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen, Sheng Xu
22 Sep 2020

Kernel-Based Smoothness Analysis of Residual Networks
Tom Tirer, Joan Bruna, Raja Giryes
21 Sep 2020

Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu
21 Sep 2020

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang
GNN
07 Sep 2020

It's Hard for Neural Networks To Learn the Game of Life
Jacob Mitchell Springer, Garrett Kenyon
03 Sep 2020

Predicting Training Time Without Training
Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
28 Aug 2020

A Dynamical Central Limit Theorem for Shallow Neural Networks
Zhengdao Chen, Grant M. Rotskoff, Joan Bruna, Eric Vanden-Eijnden
21 Aug 2020

Asymptotics of Wide Convolutional Neural Networks
Anders Andreassen, Ethan Dyer
19 Aug 2020

The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Ben Adlam, Jeffrey Pennington
15 Aug 2020

On the Generalization Properties of Adversarial Training
Yue Xing, Qifan Song, Guang Cheng
AAML
15 Aug 2020

Adversarial Training and Provable Robustness: A Tale of Two Objectives
Jiameng Fan, Wenchao Li
AAML
13 Aug 2020

Multiple Descent: Design Your Own Generalization Curve
Lin Chen, Yifei Min, M. Belkin, Amin Karbasi
DRL
03 Aug 2020

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu, Zhuoran Yang, Zhaoran Wang
02 Aug 2020

Finite Versus Infinite Neural Networks: an Empirical Study
Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein
31 Jul 2020

On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics
E. Weinan, Stephan Wojtowytsch
MLT
30 Jul 2020

Universality of Gradient Descent Neural Network Training
G. Welper
27 Jul 2020

Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Reinhard Heckel, Fatih Yilmaz
20 Jul 2020

Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions
Sinong Geng, Houssam Nassif, Carlos A. Manzanares, A. M. Reppen, R. Sircar
15 Jul 2020

From Symmetry to Geometry: Tractable Nonconvex Problems
Yuqian Zhang, Qing Qu, John N. Wright
14 Jul 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry
13 Jul 2020

Maximum-and-Concatenation Networks
Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin
09 Jul 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
MLT
09 Jul 2020

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)
Yuqing Li, Yaoyu Zhang, N. Yip
07 Jul 2020

DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths
Yanwei Fu, Chen Liu, Donghao Li, Xinwei Sun, Jinshan Zeng, Yuan Yao
04 Jul 2020

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Yibo Jiang, Cengiz Pehlevan
30 Jun 2020

Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory
Yaoyu Zhang, Haizhao Yang
28 Jun 2020

Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets
Haoxiang Wang, Ruoyu Sun, Bo Li
MLT, AI4CE
25 Jun 2020

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
Chao Ma, Lei Wu, E. Weinan
MLT
25 Jun 2020

Towards Understanding Hierarchical Learning: Benefits of Neural Representations
Minshuo Chen, Yu Bai, Jason D. Lee, T. Zhao, Huan Wang, Caiming Xiong, R. Socher
SSL
24 Jun 2020

When Do Neural Networks Outperform Kernel Methods?
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
24 Jun 2020

On the Global Optimality of Model-Agnostic Meta-Learning
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
23 Jun 2020

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda, Taiji Suzuki
22 Jun 2020