arXiv: 1902.04674

Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak, Mahdi Soltanolkotabi
12 February 2019
Available as: arXiv (abs), PDF, HTML

Papers citing "Towards moderate overparameterization: global convergence guarantees for training shallow neural networks"
45 / 45 papers shown

| Title | Authors | Tags | Metrics | Date |
|---|---|---|---|---|
| Reparameterization invariance in approximate Bayesian inference | Hrittik Roy, M. Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pfortner, Lukas Tatzel, Søren Hauberg | BDL | 92 · 9 · 0 | 05 Jun 2024 |
| Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss | T. Getu, Georges Kaddoum, M. Bennis | — | 76 · 1 · 0 | 13 Sep 2023 |
| Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning | François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang | — | 115 · 2 · 0 | 02 Feb 2023 |
| Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian | Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi | MLT | 66 · 88 · 0 | 12 Jun 2019 |
| An Improved Analysis of Training Over-parameterized Deep Neural Networks | Difan Zou, Quanquan Gu | — | 63 · 235 · 0 | 11 Jun 2019 |
| On Exact Computation with an Infinitely Wide Neural Net | Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang | — | 238 · 928 · 0 | 26 Apr 2019 |
| Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path? | Samet Oymak, Mahdi Soltanolkotabi | ODL | 55 · 177 · 0 | 25 Dec 2018 |
| Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks | Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu | ODL | 198 · 448 · 0 | 21 Nov 2018 |
| Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers | Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang | MLT | 201 · 775 · 0 | 12 Nov 2018 |
| A Convergence Theory for Deep Learning via Over-Parameterization | Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song | AI4CE, ODL | 266 · 1,469 · 0 | 09 Nov 2018 |
| Gradient Descent Finds Global Minima of Deep Neural Networks | S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka | ODL | 229 · 1,136 · 0 | 09 Nov 2018 |
| Gradient Descent Provably Optimizes Over-parameterized Neural Networks | S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh | MLT, ODL | 233 · 1,276 · 0 | 04 Oct 2018 |
| Gradient descent aligns the layers of deep linear networks | Ziwei Ji, Matus Telgarsky | — | 123 · 257 · 0 | 04 Oct 2018 |
| Stochastic Gradient Descent Learns State Equations with Nonlinear Activations | Samet Oymak | — | 56 · 43 · 0 | 09 Sep 2018 |
| Mean Field Analysis of Neural Networks: A Central Limit Theorem | Justin A. Sirignano, K. Spiliopoulos | MLT | 77 · 194 · 0 | 28 Aug 2018 |
| Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data | Yuanzhi Li, Yingyu Liang | MLT | 219 · 653 · 0 | 03 Aug 2018 |
| Just Interpolate: Kernel "Ridgeless" Regression Can Generalize | Tengyuan Liang, Alexander Rakhlin | — | 81 · 355 · 0 | 01 Aug 2018 |
| Does data interpolation contradict statistical optimality? | M. Belkin, Alexander Rakhlin, Alexandre B. Tsybakov | — | 88 · 220 · 0 | 25 Jun 2018 |
| Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate | M. Belkin, Daniel J. Hsu, P. Mitra | AI4CE | 149 · 259 · 0 | 13 Jun 2018 |
| On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport | Lénaïc Chizat, Francis R. Bach | OT | 214 · 737 · 0 | 24 May 2018 |
| The Global Optimization Geometry of Shallow Linear Neural Networks | Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin | ODL | 73 · 36 · 0 | 13 May 2018 |
| A Mean Field View of the Landscape of Two-Layers Neural Networks | Song Mei, Andrea Montanari, Phan-Minh Nguyen | MLT | 105 · 862 · 0 | 18 Apr 2018 |
| On the Local Minima of the Empirical Risk | Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan | FedML | 138 · 56 · 0 | 25 Mar 2018 |
| On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization | Sanjeev Arora, Nadav Cohen, Elad Hazan | — | 108 · 487 · 0 | 19 Feb 2018 |
| Spurious Valleys in Two-layer Neural Network Optimization Landscapes | Luca Venturi, Afonso S. Bandeira, Joan Bruna | — | 60 · 74 · 0 | 18 Feb 2018 |
| Stronger generalization bounds for deep nets via a compression approach | Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang | MLT, AI4CE | 89 · 643 · 0 | 14 Feb 2018 |
| To understand deep learning we need to understand kernel learning | M. Belkin, Siyuan Ma, Soumik Mandal | — | 72 · 420 · 0 | 05 Feb 2018 |
| Spurious Local Minima are Common in Two-Layer ReLU Neural Networks | Itay Safran, Ohad Shamir | — | 182 · 265 · 0 | 24 Dec 2017 |
| Size-Independent Sample Complexity of Neural Networks | Noah Golowich, Alexander Rakhlin, Ohad Shamir | — | 154 · 551 · 0 | 18 Dec 2017 |
| Learning One-hidden-layer Neural Networks with Landscape Design | Rong Ge, Jason D. Lee, Tengyu Ma | MLT | 206 · 262 · 0 | 01 Nov 2017 |
| SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data | Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz | MLT | 156 · 279 · 0 | 27 Oct 2017 |
| A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks | Behnam Neyshabur, Srinadh Bhojanapalli, Nathan Srebro | — | 88 · 610 · 0 | 29 Jul 2017 |
| Theoretical insights into the optimization landscape of over-parameterized shallow neural networks | Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee | — | 185 · 423 · 0 | 16 Jul 2017 |
| Spectrally-normalized margin bounds for neural networks | Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky | ODL | 212 · 1,225 · 0 | 26 Jun 2017 |
| Empirical Analysis of the Hessian of Over-Parametrized Neural Networks | Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou | — | 56 · 419 · 0 | 14 Jun 2017 |
| Recovery Guarantees for One-hidden-layer Neural Networks | Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon | MLT | 181 · 337 · 0 | 10 Jun 2017 |
| Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk | Paul Hand, V. Voroninski | UQCV | 143 · 138 · 0 | 22 May 2017 |
| The Landscape of Deep Learning Algorithms | Pan Zhou, Jiashi Feng | — | 63 · 24 · 0 | 19 May 2017 |
| Learning ReLUs via Gradient Descent | Mahdi Soltanolkotabi | MLT | 86 · 183 · 0 | 10 May 2017 |
| Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs | Alon Brutzkus, Amir Globerson | MLT | 173 · 313 · 0 | 26 Feb 2017 |
| Understanding deep learning requires rethinking generalization | Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals | HAI | 351 · 4,636 · 0 | 10 Nov 2016 |
| Entropy-SGD: Biasing Gradient Descent Into Wide Valleys | Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina | ODL | 96 · 774 · 0 | 06 Nov 2016 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 433 · 2,946 · 0 | 15 Sep 2016 |
| The Landscape of Empirical Risk for Non-convex Losses | Song Mei, Yu Bai, Andrea Montanari | — | 117 · 313 · 0 | 22 Jul 2016 |
| No bad local minima: Data independent training error guarantees for multilayer neural networks | Daniel Soudry, Y. Carmon | — | 202 · 236 · 0 | 26 May 2016 |