Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
    ODL

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 216 papers shown
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
Chun Yang
Xu-Cheng Yin
Zejun Li
Jianwei Wu
Chunchao Guo
Hongfa Wang
Lei Xiao
24
10
0
10 Oct 2017
Natasha 2: Faster Non-Convex Optimization Than SGD
Zeyuan Allen-Zhu
ODL
28
245
0
29 Aug 2017
Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning
P. Jund
Andreas Eitel
N. Abdo
Wolfram Burgard
3DPC
16
19
0
04 Jul 2017
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Frank E. Curtis
K. Scheinberg
39
45
0
30 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
P. Di Lorenzo
22
9
0
15 Jun 2017
Proximal Backpropagation
Thomas Frerix
Thomas Möllenhoff
Michael Möller
Daniel Cremers
23
31
0
14 Jun 2017
A Well-Tempered Landscape for Non-convex Robust Subspace Recovery
Tyler Maunu
Teng Zhang
Gilad Lerman
24
63
0
13 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
34
336
0
10 Jun 2017
Global Convergence of the (1+1) Evolution Strategy
Tobias Glasmachers
17
9
0
09 Jun 2017
Are Saddles Good Enough for Deep Learning?
Adepu Ravi Sankar
V. Balasubramanian
35
5
0
07 Jun 2017
Spectral Norm Regularization for Improving the Generalizability of Deep Learning
Yuichi Yoshida
Takeru Miyato
35
325
0
31 May 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
41
795
0
24 May 2017
Sub-sampled Cubic Regularization for Non-convex Optimization
Jonas Köhler
Aurelien Lucchi
19
164
0
16 May 2017
Deep neural networks on graph signals for brain imaging analysis
Yiluan Guo
Hossein Nejati
Ngai-man Cheung
GNN
19
25
0
13 May 2017
The loss surface of deep and wide neural networks
Quynh N. Nguyen
Matthias Hein
ODL
51
283
0
26 Apr 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari
Adam M. Oberman
Stanley Osher
Stefano Soatto
G. Carlier
27
153
0
17 Apr 2017
Snapshot Ensembles: Train 1, get M for free
Gao Huang
Yixuan Li
Geoff Pleiss
Zhuang Liu
J. Hopcroft
Kilian Q. Weinberger
OOD
FedML
UQCV
45
935
0
01 Apr 2017
Failures of Gradient-Based Deep Learning
Shai Shalev-Shwartz
Ohad Shamir
Shaked Shammah
ODL
UQCV
34
198
0
23 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
46
757
0
15 Mar 2017
Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
Nanyang Ye
Zhanxing Zhu
Rafał K. Mantiuk
19
20
0
13 Mar 2017
How to Escape Saddle Points Efficiently
Chi Jin
Rong Ge
Praneeth Netrapalli
Sham Kakade
Michael I. Jordan
ODL
37
831
0
02 Mar 2017
On the Origin of Deep Learning
Haohan Wang
Bhiksha Raj
MedIm
3DV
VLM
48
223
0
24 Feb 2017
An Introduction to Deep Learning for the Physical Layer
Tim O'Shea
J. Hoydis
AI4CE
89
2,172
0
02 Feb 2017
Convergence Results for Neural Networks via Electrodynamics
Rina Panigrahy
Sushant Sachdeva
Qiuyi Zhang
MLT
MDE
29
22
0
01 Feb 2017
Embedding Watermarks into Deep Neural Networks
Yusuke Uchida
Yuki Nagai
S. Sakazawa
Shin'ichi Satoh
48
597
0
15 Jan 2017
An empirical analysis of the optimization of deep network loss surfaces
Daniel Jiwoong Im
Michael Tao
K. Branson
ODL
35
61
0
13 Dec 2016
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
32
228
0
22 Nov 2016
Local minima in training of neural networks
G. Swirszcz
Wojciech M. Czarnecki
Razvan Pascanu
ODL
29
73
0
19 Nov 2016
Low-rank Bilinear Pooling for Fine-Grained Classification
Shu Kong
Charless C. Fowlkes
28
344
0
16 Nov 2016
The Power of Normalization: Faster Evasion of Saddle Points
Kfir Y. Levy
22
108
0
15 Nov 2016
Identity Matters in Deep Learning
Moritz Hardt
Tengyu Ma
OOD
25
398
0
14 Nov 2016
Topology and Geometry of Half-Rectified Network Optimization
C. Freeman
Joan Bruna
19
233
0
04 Nov 2016
Demystifying ResNet
Sihan Li
Jiantao Jiao
Yanjun Han
Tsachy Weissman
30
38
0
03 Nov 2016
Finding Approximate Local Minima Faster than Gradient Descent
Naman Agarwal
Zeyuan Allen-Zhu
Brian Bullins
Elad Hazan
Tengyu Ma
41
83
0
03 Nov 2016
Master's Thesis : Deep Learning for Visual Recognition
Rémi Cadène
Nicolas Thome
Matthieu Cord
37
4
0
18 Oct 2016
An overview of gradient descent optimization algorithms
Sebastian Ruder
ODL
37
6,136
0
15 Sep 2016
Convexified Convolutional Neural Networks
Yuchen Zhang
Percy Liang
Martin J. Wainwright
26
64
0
04 Sep 2016
Mollifying Networks
Çağlar Gülçehre
Marcin Moczulski
Francesco Visin
Yoshua Bengio
23
46
0
17 Aug 2016
TerpreT: A Probabilistic Programming Language for Program Induction
Alexander L. Gaunt
Marc Brockschmidt
Rishabh Singh
Nate Kushman
Pushmeet Kohli
Jonathan Taylor
Daniel Tarlow
30
123
0
15 Aug 2016
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
Vardan Papyan
Yaniv Romano
Michael Elad
59
284
0
27 Jul 2016
On the Expressive Power of Deep Neural Networks
M. Raghu
Ben Poole
Jon M. Kleinberg
Surya Ganguli
Jascha Narain Sohl-Dickstein
29
777
0
16 Jun 2016
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
78
3,176
0
15 Jun 2016
CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling
Zhaofei Yu
David Kappel
Robert Legenstein
Sen Song
Feng Chen
Wolfgang Maass
23
1
0
01 Jun 2016
No bad local minima: Data independent training error guarantees for multilayer neural networks
Daniel Soudry
Y. Carmon
19
235
0
26 May 2016
Deep Learning without Poor Local Minima
Kenji Kawaguchi
ODL
19
917
0
23 May 2016
Training Neural Networks Without Gradients: A Scalable ADMM Approach
Gavin Taylor
R. Burmeister
Zheng Xu
Bharat Singh
Ankit B. Patel
Tom Goldstein
ODL
16
272
0
06 May 2016
Deep Learning in Bioinformatics
Seonwoo Min
Byunghan Lee
Sungroh Yoon
AI4CE
3DV
36
1,351
0
21 Mar 2016
DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters
Hanjoo Kim
Jaehong Park
Jaehee Jang
Sungroh Yoon
BDL
32
37
0
26 Feb 2016
Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan
Eider Moore
Daniel Ramage
S. Hampson
Blaise Agüera y Arcas
FedML
26
17,032
0
17 Feb 2016
Gradient Descent Converges to Minimizers
J. Lee
Max Simchowitz
Michael I. Jordan
Benjamin Recht
32
212
0
16 Feb 2016