arXiv:1908.02419
Cited By
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
Kenji Kawaguchi, Jiaoyang Huang
Allerton Conference on Communication, Control, and Computing (Allerton), 2019
5 August 2019
Papers citing "Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes" (43 papers shown)
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions. Agnideep Aich, Ashit Aich, Bruce Wade. 29 Jul 2025.
Distributionally Robust Wireless Semantic Communication with Large AI Models. Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu, Yansong Shi, Phuong Vo, Walid Saad, Zhu Han, Choong Seon Hong. 28 May 2025.
Statistically guided deep learning. Michael Kohler, A. Krzyżak. 11 Apr 2025.
Approximation and Gradient Descent Training with Neural Networks. G. Welper. 19 May 2024.
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent. Michael Kohler, A. Krzyżak, Benjamin Walter. Journal of Statistical Planning and Inference (JSPI), 2024. 13 May 2024.
Analysis of the expected L_2 error of an over-parametrized deep neural network estimate learned by gradient descent without regularization. Selina Drews, Michael Kohler. 24 Nov 2023.
Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review. M. Lê, Pierre Wolinski, Julyan Arbel. 20 Nov 2023.
Approximation Results for Gradient Descent trained Neural Networks. G. Welper. 09 Sep 2023.
Six Lectures on Linearized Neural Networks. Theodor Misiakiewicz, Andrea Montanari. Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023. 25 Aug 2023.
Global Convergence of SGD On Two Layer Neural Nets. Pulkit Gopalani, Anirbit Mukherjee. Information and Inference: A Journal of the IMA, 2022. 20 Oct 2022.
Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent. Michael Kohler, A. Krzyżak. IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2022. 04 Oct 2022.
Approximation results for Gradient Descent trained Shallow Neural Networks in 1d. R. Gentile, G. Welper. 17 Sep 2022.
On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent. Selina Drews, Michael Kohler. Annals of the Institute of Statistical Mathematics (AISM), 2022. 30 Aug 2022.
Robustness Implies Generalization via Data-Dependent Generalization Bounds. Kenji Kawaguchi, Zhun Deng, K. Luh, Jiaoyang Huang. International Conference on Machine Learning (ICML), 2022. 27 Jun 2022.
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis. Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff. International Conference on Machine Learning (ICML), 2022. 26 Jun 2022.
Parameter Convex Neural Networks. Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng. 11 Jun 2022.
Overcoming the Spectral Bias of Neural Value Approximation. Ge Yang, Anurag Ajay, Pulkit Agrawal. International Conference on Learning Representations (ICLR), 2022. 09 Jun 2022.
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks. Bartlomiej Polaczyk, J. Cyranka. 28 Jan 2022.
Learning Proximal Operators to Discover Multiple Optima. Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon. International Conference on Learning Representations (ICLR), 2022. 28 Jan 2022.
Complexity from Adaptive-Symmetries Breaking: Global Minima in the Statistical Mechanics of Deep Neural Networks. Shaun Li. 03 Jan 2022.
Subquadratic Overparameterization for Shallow Neural Networks. Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, Volkan Cevher. Neural Information Processing Systems (NeurIPS), 2021. 02 Nov 2021.
On the Double Descent of Random Features Models Trained with SGD. Fanghui Liu, Johan A. K. Suykens, Volkan Cevher. 13 Oct 2021.
Convergence rates for shallow neural networks learned by gradient descent. Alina Braun, Michael Kohler, S. Langer, Harro Walk. 20 Jul 2021.
Meta-learning PINN loss functions. Apostolos F. Psaros, Kenji Kawaguchi, George Karniadakis. 12 Jul 2021.
Learning distinct features helps, provably. Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj. 10 Jun 2021.
Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth. Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi. International Conference on Machine Learning (ICML), 2021. 10 May 2021.
A Recipe for Global Convergence Guarantee in Deep Neural Networks. Kenji Kawaguchi, Qingyun Sun. AAAI Conference on Artificial Intelligence (AAAI), 2021. 12 Apr 2021.
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers. Kenji Kawaguchi. International Conference on Learning Representations (ICLR), 2021. 15 Feb 2021.
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks. Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin. 12 Jan 2021.
End-to-end Kernel Learning via Generative Random Fourier Features. Kun Fang, Fanghui Liu, Xiaolin Huang, Jie Yang. Pattern Recognition (Pattern Recognit.), 2020. 10 Sep 2020.
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH). Yuqing Li, Yaoyu Zhang, N. Yip. CSIAM Transactions on Applied Mathematics (CSIAM Trans. Appl. Math.), 2020. 07 Jul 2020.
Network size and weights size for memorization with two-layers neural networks. Sébastien Bubeck, Ronen Eldan, Y. Lee, Dan Mikulincer. 04 Jun 2020.
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm. Sayar Karmakar, Anirbit Mukherjee. 08 May 2020.
On the Global Convergence of Training Deep Linear ResNets. Difan Zou, Philip M. Long, Quanquan Gu. International Conference on Learning Representations (ICLR), 2020. 02 Mar 2020.
A Corrective View of Neural Networks: Representation, Memorization and Learning. Guy Bresler, Dheeraj M. Nagaraj. Conference on Learning Theory (COLT), 2020. 01 Feb 2020.
Over-parametrized deep neural networks do not generalize well. Michael Kohler, A. Krzyżak. 09 Dec 2019.
Analysis of the rate of convergence of neural network regression estimates which are easy to implement. Alina Braun, Michael Kohler, A. Krzyżak. 09 Dec 2019.
On the rate of convergence of a neural network regression estimate learned by gradient descent. Alina Braun, Michael Kohler, Harro Walk. 09 Dec 2019.
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu. International Conference on Learning Representations (ICLR), 2019. 27 Nov 2019.
Quadratic number of nodes is sufficient to learn a dataset via gradient descent. Biswarup Das, Eugene Golikov. 13 Nov 2019.
Nearly Minimal Over-Parametrization of Shallow Neural Networks. Armin Eftekhari, Chaehwan Song, Volkan Cevher. 09 Oct 2019.
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy. Jiaoyang Huang, H. Yau. International Conference on Machine Learning (ICML), 2019. 18 Sep 2019.
Elimination of All Bad Local Minima in Deep Learning. Kenji Kawaguchi, L. Kaelbling. 02 Jan 2019.