Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

26 February 2017

Papers citing "Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs"

37 / 87 papers shown

Title
Width Provably Matters in Optimization for Deep Linear Neural Networks S. Du Wei Hu 21 94 0 24 Jan 2019
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks Difan Zou Yuan Cao Dongruo Zhou Quanquan Gu ODL 33 446 0 21 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks S. Du J. Lee Haochuan Li Liwei Wang Xiyu Zhai ODL 44 1,122 0 09 Nov 2018
On the Convergence Rate of Training Recurrent Neural Networks Zeyuan Allen-Zhu Yuanzhi Li Zhao Song 23 191 0 29 Oct 2018
Subgradient Descent Learns Orthogonal Dictionaries Yu Bai Qijia Jiang Ju Sun 20 51 0 25 Oct 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity Chulhee Yun S. Sra Ali Jadbabaie 26 117 0 17 Oct 2018
Learning Two-layer Neural Networks with Symmetric Inputs Rong Ge Rohith Kuditipudi Zhize Li Xiang Wang OOD MLT 36 57 0 16 Oct 2018
Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem Alon Brutzkus Amir Globerson MLT 11 7 0 06 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks Sanjeev Arora Nadav Cohen Noah Golowich Wei Hu 27 281 0 04 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks S. Du Xiyu Zhai Barnabás Póczos Aarti Singh MLT ODL 53 1,250 0 04 Oct 2018
Towards Understanding Regularization in Batch Normalization Ping Luo Xinjiang Wang Wenqi Shao Zhanglin Peng MLT AI4CE 23 179 0 04 Sep 2018
Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks Penghang Yin Shuai Zhang J. Lyu Stanley Osher Y. Qi Jack Xin MQ 44 61 0 15 Aug 2018
Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization G. Wang G. Giannakis Jie Chen MLT 24 131 0 14 Aug 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent Xiao Zhang Yaodong Yu Lingxiao Wang Quanquan Gu MLT 28 134 0 20 Jun 2018
Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex Hongyang R. Zhang Junru Shao Ruslan Salakhutdinov 39 14 0 06 Jun 2018
Adding One Neuron Can Eliminate All Bad Local Minima Shiyu Liang Ruoyu Sun J. Lee R. Srikant 37 89 0 22 May 2018
How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network? S. Du Yining Wang Xiyu Zhai Sivaraman Balakrishnan Ruslan Salakhutdinov Aarti Singh SSL 21 57 0 21 May 2018
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps S. Du Surbhi Goel MLT 30 17 0 20 May 2018
End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition Samet Oymak Mahdi Soltanolkotabi 21 12 0 16 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks Song Mei Andrea Montanari Phan-Minh Nguyen MLT 43 850 0 18 Apr 2018
Understanding the Loss Surface of Neural Networks for Binary Classification Shiyu Liang Ruoyu Sun Yixuan Li R. Srikant 21 87 0 19 Feb 2018
Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks Peter L. Bartlett D. Helmbold Philip M. Long 33 116 0 16 Feb 2018
Learning One Convolutional Layer with Overlapping Patches Surbhi Goel Adam R. Klivans Raghu Meka MLT 16 80 0 07 Feb 2018
Learning Compact Neural Networks with Regularization Samet Oymak MLT 41 39 0 05 Feb 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks Itay Safran Ohad Shamir 40 261 0 24 Dec 2017
Non-convex Optimization for Machine Learning Prateek Jain Purushottam Kar 33 479 0 21 Dec 2017
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data Alon Brutzkus Amir Globerson Eran Malach Shai Shalev-Shwartz MLT 50 276 0 27 Oct 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks Mahdi Soltanolkotabi Adel Javanmard J. Lee 36 415 0 16 Jul 2017
Recovery Guarantees for One-hidden-layer Neural Networks Kai Zhong Zhao Song Prateek Jain Peter L. Bartlett Inderjit S. Dhillon MLT 34 336 0 10 Jun 2017
On the stable recovery of deep structured linear networks under sparsity constraints F. Malgouyres 24 7 0 31 May 2017
Learning ReLUs via Gradient Descent Mahdi Soltanolkotabi MLT 20 181 0 10 May 2017
Estimating the Coefficients of a Mixture of Two Linear Regressions by Expectation Maximization Jason M. Klusowski Dana Yang W. Brinda 34 41 0 26 Apr 2017
The loss surface of deep and wide neural networks Quynh N. Nguyen Matthias Hein ODL 51 283 0 26 Apr 2017
Convergence Results for Neural Networks via Electrodynamics Rina Panigrahy Sushant Sachdeva Qiuyi Zhang MLT MDE 29 22 0 01 Feb 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,746 0 26 Sep 2016
Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $\ell^1$ and $\ell^0$ Controls Jason M. Klusowski Andrew R. Barron 132 142 0 26 Jul 2016
The Loss Surfaces of Multilayer Networks A. Choromańska Mikael Henaff Michaël Mathieu Gerard Ben Arous Yann LeCun ODL 183 1,185 0 30 Nov 2014