No bad local minima: Data independent training error guarantees for multilayer neural networks

26 May 2016 · Daniel Soudry, Y. Carmon

Papers citing "No bad local minima: Data independent training error guarantees for multilayer neural networks" (19 papers)

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 0 citations · 08 Feb 2024

An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Yuandong Tian · MLT · 217 citations · 02 Mar 2017

Exponentially vanishing sub-optimal local minima in multilayer neural networks
Daniel Soudry, Elad Hoffer · 97 citations · 19 Feb 2017

Gradient Descent Converges to Minimizers
Jason D. Lee, Max Simchowitz, Michael I. Jordan, Benjamin Recht · 212 citations · 16 Feb 2016

Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun · MedIm · 194,426 citations · 10 Dec 2015

On the Quality of the Initial Basin in Overspecified Neural Networks
Itay Safran, Ohad Shamir · 127 citations · 13 Nov 2015

Train faster, generalize better: Stability of stochastic gradient descent
Moritz Hardt, Benjamin Recht, Y. Singer · 1,243 citations · 03 Sep 2015

Global Optimality in Tensor Factorization, Deep Learning, and Beyond
B. Haeffele, René Vidal · 150 citations · 24 Jun 2015

Empirical Evaluation of Rectified Activations in Convolutional Network
Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li · 2,913 citations · 05 May 2015

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun · VLM · 18,651 citations · 06 Feb 2015

Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba · ODL · 150,312 citations · 22 Dec 2014

Qualitatively characterizing neural network optimization problems
Ian Goodfellow, Oriol Vinyals, Andrew M. Saxe · ODL · 523 citations · 19 Dec 2014

The Loss Surfaces of Multilayer Networks
A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun · ODL · 1,200 citations · 30 Nov 2014

On the Computational Efficiency of Training Neural Networks
Roi Livni, Shai Shalev-Shwartz, Ohad Shamir · 480 citations · 05 Oct 2014

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Yann N. Dauphin, Razvan Pascanu, Çağlar Gülçehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio · ODL · 1,389 citations · 10 Jun 2014

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe, James L. McClelland, Surya Ganguli · ODL · 1,852 citations · 20 Dec 2013

Smoothed Analysis of Tensor Decompositions
Aditya Bhaskara, Moses Charikar, Ankur Moitra, Aravindan Vijayaraghavan · 155 citations · 14 Nov 2013

Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton, Nitish Srivastava, A. Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov · VLM · 7,667 citations · 03 Jul 2012

Identifiability of parameters in latent structure models with many observed variables
E. Allman, C. Matias, J. Rhodes · CML · 534 citations · 29 Sep 2008