1406.2572
Cited By
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
ODL
Papers citing
"Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"
50 / 216 papers shown
Title
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
Chun Yang
Xu-Cheng Yin
Zejun Li
Jianwei Wu
Chunchao Guo
Hongfa Wang
Lei Xiao
24
10
0
10 Oct 2017
Natasha 2: Faster Non-Convex Optimization Than SGD
Zeyuan Allen-Zhu
ODL
28
245
0
29 Aug 2017
Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning
P. Jund
Andreas Eitel
N. Abdo
Wolfram Burgard
3DPC
16
19
0
04 Jul 2017
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Frank E. Curtis
K. Scheinberg
39
45
0
30 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
P. Di Lorenzo
22
9
0
15 Jun 2017
Proximal Backpropagation
Thomas Frerix
Thomas Möllenhoff
Michael Möller
Daniel Cremers
23
31
0
14 Jun 2017
A Well-Tempered Landscape for Non-convex Robust Subspace Recovery
Tyler Maunu
Teng Zhang
Gilad Lerman
24
63
0
13 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
34
336
0
10 Jun 2017
Global Convergence of the (1+1) Evolution Strategy
Tobias Glasmachers
17
9
0
09 Jun 2017
Are Saddles Good Enough for Deep Learning?
Adepu Ravi Sankar
V. Balasubramanian
35
5
0
07 Jun 2017
Spectral Norm Regularization for Improving the Generalizability of Deep Learning
Yuichi Yoshida
Takeru Miyato
35
325
0
31 May 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
41
795
0
24 May 2017
Sub-sampled Cubic Regularization for Non-convex Optimization
Jonas Köhler
Aurelien Lucchi
19
164
0
16 May 2017
Deep neural networks on graph signals for brain imaging analysis
Yiluan Guo
Hossein Nejati
Ngai-man Cheung
GNN
19
25
0
13 May 2017
The loss surface of deep and wide neural networks
Quynh N. Nguyen
Matthias Hein
ODL
51
283
0
26 Apr 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari
Adam M. Oberman
Stanley Osher
Stefano Soatto
G. Carlier
27
153
0
17 Apr 2017
Snapshot Ensembles: Train 1, get M for free
Gao Huang
Yixuan Li
Geoff Pleiss
Zhuang Liu
J. Hopcroft
Kilian Q. Weinberger
OOD
FedML
UQCV
45
935
0
01 Apr 2017
Failures of Gradient-Based Deep Learning
Shai Shalev-Shwartz
Ohad Shamir
Shaked Shammah
ODL
UQCV
34
198
0
23 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
46
757
0
15 Mar 2017
Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
Nanyang Ye
Zhanxing Zhu
Rafał K. Mantiuk
19
20
0
13 Mar 2017
How to Escape Saddle Points Efficiently
Chi Jin
Rong Ge
Praneeth Netrapalli
Sham Kakade
Michael I. Jordan
ODL
37
831
0
02 Mar 2017
On the Origin of Deep Learning
Haohan Wang
Bhiksha Raj
MedIm
3DV
VLM
48
223
0
24 Feb 2017
An Introduction to Deep Learning for the Physical Layer
Tim O'Shea
J. Hoydis
AI4CE
89
2,172
0
02 Feb 2017
Convergence Results for Neural Networks via Electrodynamics
Rina Panigrahy
Sushant Sachdeva
Qiuyi Zhang
MLT
MDE
29
22
0
01 Feb 2017
Embedding Watermarks into Deep Neural Networks
Yusuke Uchida
Yuki Nagai
S. Sakazawa
Shin'ichi Satoh
48
597
0
15 Jan 2017
An empirical analysis of the optimization of deep network loss surfaces
Daniel Jiwoong Im
Michael Tao
K. Branson
ODL
35
61
0
13 Dec 2016
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
32
228
0
22 Nov 2016
Local minima in training of neural networks
G. Swirszcz
Wojciech M. Czarnecki
Razvan Pascanu
ODL
29
73
0
19 Nov 2016
Low-rank Bilinear Pooling for Fine-Grained Classification
Shu Kong
Charless C. Fowlkes
28
344
0
16 Nov 2016
The Power of Normalization: Faster Evasion of Saddle Points
Kfir Y. Levy
22
108
0
15 Nov 2016
Identity Matters in Deep Learning
Moritz Hardt
Tengyu Ma
OOD
25
398
0
14 Nov 2016
Topology and Geometry of Half-Rectified Network Optimization
C. Freeman
Joan Bruna
19
233
0
04 Nov 2016
Demystifying ResNet
Sihan Li
Jiantao Jiao
Yanjun Han
Tsachy Weissman
30
38
0
03 Nov 2016
Finding Approximate Local Minima Faster than Gradient Descent
Naman Agarwal
Zeyuan Allen-Zhu
Brian Bullins
Elad Hazan
Tengyu Ma
41
83
0
03 Nov 2016
Master's Thesis : Deep Learning for Visual Recognition
Rémi Cadène
Nicolas Thome
Matthieu Cord
37
4
0
18 Oct 2016
An overview of gradient descent optimization algorithms
Sebastian Ruder
ODL
37
6,136
0
15 Sep 2016
Convexified Convolutional Neural Networks
Yuchen Zhang
Percy Liang
Martin J. Wainwright
26
64
0
04 Sep 2016
Mollifying Networks
Çağlar Gülçehre
Marcin Moczulski
Francesco Visin
Yoshua Bengio
23
46
0
17 Aug 2016
TerpreT: A Probabilistic Programming Language for Program Induction
Alexander L. Gaunt
Marc Brockschmidt
Rishabh Singh
Nate Kushman
Pushmeet Kohli
Jonathan Taylor
Daniel Tarlow
30
123
0
15 Aug 2016
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
Vardan Papyan
Yaniv Romano
Michael Elad
59
284
0
27 Jul 2016
On the Expressive Power of Deep Neural Networks
M. Raghu
Ben Poole
Jon M. Kleinberg
Surya Ganguli
Jascha Narain Sohl-Dickstein
29
777
0
16 Jun 2016
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
78
3,176
0
15 Jun 2016
CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling
Zhaofei Yu
David Kappel
Robert Legenstein
Sen Song
Feng Chen
Wolfgang Maass
23
1
0
01 Jun 2016
No bad local minima: Data independent training error guarantees for multilayer neural networks
Daniel Soudry
Y. Carmon
19
235
0
26 May 2016
Deep Learning without Poor Local Minima
Kenji Kawaguchi
ODL
19
917
0
23 May 2016
Training Neural Networks Without Gradients: A Scalable ADMM Approach
Gavin Taylor
R. Burmeister
Zheng Xu
Bharat Singh
Ankit B. Patel
Tom Goldstein
ODL
16
272
0
06 May 2016
Deep Learning in Bioinformatics
Seonwoo Min
Byunghan Lee
Sungroh Yoon
AI4CE
3DV
36
1,351
0
21 Mar 2016
DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters
Hanjoo Kim
Jaehong Park
Jaehee Jang
Sungroh Yoon
BDL
32
37
0
26 Feb 2016
Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan
Eider Moore
Daniel Ramage
S. Hampson
Blaise Agüera y Arcas
FedML
26
17,032
0
17 Feb 2016
Gradient Descent Converges to Minimizers
J. Lee
Max Simchowitz
Michael I. Jordan
Benjamin Recht
32
212
0
16 Feb 2016