Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization

14 August 2018 · arXiv:1808.04685
G. Wang, G. Giannakis, Jie Chen · MLT

Papers citing "Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization"

43 / 43 papers shown. Each entry below gives the title, authors, topic tags where assigned, the listing's three engagement counters as displayed on the source page (the middle counter appears to track citation count), and the arXiv submission date.

Linearization of ReLU Activation Function for Neural Network-Embedded Optimization: Optimal Day-Ahead Energy Scheduling
Cunzhi Zhao, Fan Jiang, Xingpeng Li · 99 / 1 / 0 · 03 Oct 2023

Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss
T. Getu, Georges Kaddoum, M. Bennis · 71 / 1 / 0 · 13 Sep 2023

On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai, Ohad Shamir · MLT · 66 / 182 / 0 · 01 Apr 2019

Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak, Mahdi Soltanolkotabi · 48 / 321 / 0 · 12 Feb 2019

On Connected Sublevel Sets in Deep Learning
Quynh N. Nguyen · 86 / 102 / 0 · 22 Jan 2019

Fitting ReLUs via SGD and Quantized SGD
Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Avestimehr · 47 / 24 / 0 · 19 Jan 2019

Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, L. Kaelbling · 59 / 44 / 0 · 02 Jan 2019

Real-time Power System State Estimation and Forecasting via Deep Neural Networks
Liang Zhang, G. Wang, G. Giannakis · AI4TS · 59 / 185 / 0 · 15 Nov 2018

Efficiently testing local optimality and escaping saddles for ReLU networks
Chulhee Yun, S. Sra, Ali Jadbabaie · 55 / 10 / 0 · 28 Sep 2018

Stochastic Gradient Descent Learns State Equations with Nonlinear Activations
Samet Oymak · 43 / 43 / 0 · 09 Sep 2018

Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li, Yingyu Liang · MLT · 214 / 653 / 0 · 03 Aug 2018

Learning ReLU Networks via Alternating Minimization
Gauri Jagatap, Chinmay Hegde · 36 / 11 / 0 · 20 Jun 2018

Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu · MLT · 94 / 134 / 0 · 20 Jun 2018

On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond
Xingguo Li, Junwei Lu, Zhaoran Wang, Jarvis Haupt, T. Zhao · 52 / 80 / 0 · 13 Jun 2018

When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang · 56 / 19 / 0 · 12 Jun 2018

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry · FedML, MLT · 63 / 100 / 0 · 05 Jun 2018

Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant · 68 / 89 / 0 · 22 May 2018

On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions
Songtao Lu, Mingyi Hong, Zhengdao Wang · 20 / 4 / 0 · 28 Feb 2018

Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy
H. Fu, Yuejie Chi, Yingbin Liang · FedML · 64 / 39 / 0 · 18 Feb 2018

Small nonlinearities in activation functions create bad local minima in neural networks
Chulhee Yun, S. Sra, Ali Jadbabaie · ODL · 71 / 95 / 0 · 10 Feb 2018

The Multilinear Structure of ReLU Networks
T. Laurent, J. V. Brecht · 55 / 51 / 0 · 29 Dec 2017

Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir · 173 / 263 / 0 · 24 Dec 2017

Deep linear neural networks with arbitrary loss: All local minima are global
T. Laurent, J. V. Brecht · ODL · 52 / 136 / 0 · 05 Dec 2017

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
S. Du, Jason D. Lee, Yuandong Tian, Barnabás Póczós, Aarti Singh · MLT · 130 / 236 / 0 · 03 Dec 2017

The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro · 149 / 916 / 0 · 27 Oct 2017

SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz · MLT · 151 / 279 / 0 · 27 Oct 2017

Generalization in Deep Learning
Kenji Kawaguchi, L. Kaelbling, Yoshua Bengio · ODL · 86 / 459 / 0 · 16 Oct 2017

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee · 163 / 419 / 0 · 16 Jul 2017

Exploring Generalization in Deep Learning
Behnam Neyshabur, Srinadh Bhojanapalli, David A. McAllester, Nathan Srebro · FAtt · 141 / 1,255 / 0 · 27 Jun 2017

Spectrally-normalized margin bounds for neural networks
Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky · ODL · 199 / 1,217 / 0 · 26 Jun 2017

Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon · MLT · 163 / 336 / 0 · 10 Jun 2017

Learning ReLUs via Gradient Descent
Mahdi Soltanolkotabi · MLT · 68 / 181 / 0 · 10 May 2017

Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite, Daniel M. Roy · 106 / 812 / 0 · 31 Mar 2017

Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Alon Brutzkus, Amir Globerson · MLT · 163 / 313 / 0 · 26 Feb 2017

Exponential expressivity in deep neural networks through transient chaos
Ben Poole, Subhaneil Lahiri, M. Raghu, Jascha Narain Sohl-Dickstein, Surya Ganguli · 88 / 591 / 0 · 16 Jun 2016

On the Expressive Power of Deep Neural Networks
M. Raghu, Ben Poole, Jon M. Kleinberg, Surya Ganguli, Jascha Narain Sohl-Dickstein · 61 / 788 / 0 · 16 Jun 2016

Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow
G. Wang, G. Giannakis, Yonina C. Eldar · 80 / 362 / 0 · 26 May 2016

Deep Learning without Poor Local Minima
Kenji Kawaguchi · ODL · 215 / 922 / 0 · 23 May 2016

Benefits of depth in neural networks
Matus Telgarsky · 346 / 608 / 0 · 14 Feb 2016

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition
Rong Ge, Furong Huang, Chi Jin, Yang Yuan · 135 / 1,058 / 0 · 06 Mar 2015

In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
Behnam Neyshabur, Ryota Tomioka, Nathan Srebro · AI4CE · 88 / 657 / 0 · 20 Dec 2014

Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman · FAtt, MDE · 1.6K / 100,330 / 0 · 04 Sep 2014

Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
Yoshua Bengio, Nicholas Léonard, Aaron Courville · 374 / 3,128 / 0 · 15 Aug 2013