arXiv:1908.02419
Cited By
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
Kenji Kawaguchi, Jiaoyang Huang
Allerton Conference on Communication, Control, and Computing (Allerton), 2019
5 August 2019
Papers citing "Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes" (43 papers shown)
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions. Agnideep Aich, Ashit Aich, Bruce Wade. 29 Jul 2025.
Distributionally Robust Wireless Semantic Communication with Large AI Models. Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu, Yansong Shi, Phuong Vo, Walid Saad, Zhu Han, Choong Seon Hong. 28 May 2025.
Statistically guided deep learning. Michael Kohler, A. Krzyżak. 11 Apr 2025.
Approximation and Gradient Descent Training with Neural Networks. G. Welper. 19 May 2024.
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent. Michael Kohler, A. Krzyżak, Benjamin Walter. Journal of Statistical Planning and Inference (JSPI), 2024. 13 May 2024.
Analysis of the expected L_2 error of an over-parametrized deep neural network estimate learned by gradient descent without regularization. Selina Drews, Michael Kohler. 24 Nov 2023.
Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review. M. Lê, Pierre Wolinski, Julyan Arbel. 20 Nov 2023.
Approximation Results for Gradient Descent trained Neural Networks. G. Welper. 09 Sep 2023.
Six Lectures on Linearized Neural Networks. Theodor Misiakiewicz, Andrea Montanari. Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023. 25 Aug 2023.
Global Convergence of SGD On Two Layer Neural Nets. Pulkit Gopalani, Anirbit Mukherjee. Information and Inference: A Journal of the IMA, 2022. 20 Oct 2022.
Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent. Michael Kohler, A. Krzyżak. IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2022. 04 Oct 2022.
Approximation results for Gradient Descent trained Shallow Neural Networks in 1d. R. Gentile, G. Welper. 17 Sep 2022.
On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent. Selina Drews, Michael Kohler. Annals of the Institute of Statistical Mathematics (AISM), 2022. 30 Aug 2022.
Robustness Implies Generalization via Data-Dependent Generalization Bounds. Kenji Kawaguchi, Zhun Deng, K. Luh, Jiaoyang Huang. International Conference on Machine Learning (ICML), 2022. 27 Jun 2022.
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis. Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff. International Conference on Machine Learning (ICML), 2022. 26 Jun 2022.
Parameter Convex Neural Networks. Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng. 11 Jun 2022.
Overcoming the Spectral Bias of Neural Value Approximation. Ge Yang, Anurag Ajay, Pulkit Agrawal. International Conference on Learning Representations (ICLR), 2022. 09 Jun 2022.
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks. Bartlomiej Polaczyk, J. Cyranka. 28 Jan 2022.
Learning Proximal Operators to Discover Multiple Optima. Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon. International Conference on Learning Representations (ICLR), 2022. 28 Jan 2022.
Complexity from Adaptive-Symmetries Breaking: Global Minima in the Statistical Mechanics of Deep Neural Networks. Shaun Li. 03 Jan 2022.
Subquadratic Overparameterization for Shallow Neural Networks. Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, Volkan Cevher. Neural Information Processing Systems (NeurIPS), 2021. 02 Nov 2021.
On the Double Descent of Random Features Models Trained with SGD. Fanghui Liu, Johan A. K. Suykens, Volkan Cevher. 13 Oct 2021.
Convergence rates for shallow neural networks learned by gradient descent. Alina Braun, Michael Kohler, S. Langer, Harro Walk. 20 Jul 2021.
Meta-learning PINN loss functions. Apostolos F. Psaros, Kenji Kawaguchi, George Karniadakis. 12 Jul 2021.
Learning distinct features helps, provably. Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj. 10 Jun 2021.
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth. Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi. International Conference on Machine Learning (ICML), 2021. 10 May 2021.
A Recipe for Global Convergence Guarantee in Deep Neural Networks. Kenji Kawaguchi, Qingyun Sun. AAAI Conference on Artificial Intelligence (AAAI), 2021. 12 Apr 2021.
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers. Kenji Kawaguchi. International Conference on Learning Representations (ICLR), 2021. 15 Feb 2021.
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks. Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin. 12 Jan 2021.
End-to-end Kernel Learning via Generative Random Fourier Features. Kun Fang, Fanghui Liu, Xiaolin Huang, Jie Yang. Pattern Recognition (Pattern Recognit.), 2020. 10 Sep 2020.
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH). Yuqing Li, Yaoyu Zhang, N. Yip. CSIAM Transactions on Applied Mathematics (CSIAM Trans. Appl. Math.), 2020. 07 Jul 2020.
Network size and weights size for memorization with two-layers neural networks. Sébastien Bubeck, Ronen Eldan, Y. Lee, Dan Mikulincer. 04 Jun 2020.
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm. Sayar Karmakar, Anirbit Mukherjee. 08 May 2020.
On the Global Convergence of Training Deep Linear ResNets. Difan Zou, Philip M. Long, Quanquan Gu. International Conference on Learning Representations (ICLR), 2020. 02 Mar 2020.
A Corrective View of Neural Networks: Representation, Memorization and Learning. Guy Bresler, Dheeraj M. Nagaraj. Conference on Learning Theory (COLT), 2020. 01 Feb 2020.
Over-parametrized deep neural networks do not generalize well. Michael Kohler, A. Krzyżak. 09 Dec 2019.
Analysis of the rate of convergence of neural network regression estimates which are easy to implement. Alina Braun, Michael Kohler, A. Krzyżak. 09 Dec 2019.
On the rate of convergence of a neural network regression estimate learned by gradient descent. Alina Braun, Michael Kohler, Harro Walk. 09 Dec 2019.
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu. International Conference on Learning Representations (ICLR), 2019. 27 Nov 2019.
Quadratic number of nodes is sufficient to learn a dataset via gradient descent. Biswarup Das, Eugene Golikov. 13 Nov 2019.
Nearly Minimal Over-Parametrization of Shallow Neural Networks. Armin Eftekhari, Chaehwan Song, Volkan Cevher. 09 Oct 2019.
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy. Jiaoyang Huang, H. Yau. International Conference on Machine Learning (ICML), 2019. 18 Sep 2019.
Elimination of All Bad Local Minima in Deep Learning. Kenji Kawaguchi, L. Kaelbling. 02 Jan 2019.