ResearchTrend.AI
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
    ODL

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 233 papers shown
diffGrad: An Optimization Method for Convolutional Neural Networks
S. Dubey
Soumendu Chakraborty
Swalpa Kumar Roy
Snehasis Mukherjee
S. Singh
B. B. Chaudhuri
ODL
97
183
0
12 Sep 2019
Solving Continual Combinatorial Selection via Deep Reinforcement Learning
Hyungseok Song
Hyeryung Jang
H. Tran
Se-eun Yoon
Kyunghwan Son
Donggyu Yun
Hyoju Chung
Yung Yi
18
10
0
09 Sep 2019
Distributed Gradient Descent: Nonconvergence to Saddle Points and the Stable-Manifold Theorem
Brian Swenson
Ryan W. Murray
H. Vincent Poor
S. Kar
26
14
0
07 Aug 2019
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea
Berfin Simsek
Bernd Illing
W. Gerstner
23
55
0
05 Jul 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Kaiqing Zhang
Alec Koppel
Hao Zhu
Tamer Basar
44
186
0
19 Jun 2019
Orthogonal Deep Neural Networks
Kui Jia
Shuai Li
Yuxin Wen
Tongliang Liu
Dacheng Tao
36
132
0
15 May 2019
Annealing for Distributed Global Optimization
Brian Swenson
S. Kar
H. Vincent Poor
J. M. F. Moura
25
30
0
18 Mar 2019
Nonlinear Approximation via Compositions
Zuowei Shen
Haizhao Yang
Shijun Zhang
26
92
0
26 Feb 2019
Understanding and Controlling Memory in Recurrent Neural Networks
Doron Haviv
A. Rivkind
O. Barak
19
20
0
19 Feb 2019
Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
Hesham Mostafa
Xin Wang
37
307
0
15 Feb 2019
Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain
Nicolas Le Roux
Pierre-Antoine Manzagol
19
42
0
06 Feb 2019
Improving Adversarial Robustness via Promoting Ensemble Diversity
Tianyu Pang
Kun Xu
Chao Du
Ning Chen
Jun Zhu
AAML
41
434
0
25 Jan 2019
A Deterministic Gradient-Based Approach to Avoid Saddle Points
L. Kreusser
Stanley J. Osher
Bao Wang
ODL
32
3
0
21 Jan 2019
Overfitting Mechanism and Avoidance in Deep Neural Networks
Shaeke Salman
Xiuwen Liu
9
139
0
19 Jan 2019
Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Anna Sergeevna Bosman
A. Engelbrecht
Mardé Helbig
14
76
0
08 Jan 2019
Scaling description of generalization with number of parameters in deep learning
Mario Geiger
Arthur Jacot
S. Spigler
Franck Gabriel
Levent Sagun
Stéphane d'Ascoli
Giulio Biroli
Clément Hongler
M. Wyart
52
195
0
06 Jan 2019
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Henning Petzka
C. Sminchisescu
29
9
0
16 Dec 2018
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
30
229
0
12 Dec 2018
Towards Theoretical Understanding of Large Batch Training in Stochastic Gradient Descent
Xiaowu Dai
Yuhua Zhu
25
11
0
03 Dec 2018
Shared Representational Geometry Across Neural Networks
Qihong Lu
Po-Hsuan Chen
Jonathan W. Pillow
Peter J. Ramadge
K. A. Norman
Uri Hasson
OOD
21
11
0
28 Nov 2018
Learning Attractor Dynamics for Generative Memory
Yan Wu
Greg Wayne
Karol Gregor
Timothy Lillicrap
BDL
19
18
0
23 Nov 2018
Online Embedding Compression for Text Classification using Low Rank Matrix Factorization
Anish Acharya
Rahul Goel
A. Metallinou
Inderjit Dhillon
19
58
0
01 Nov 2018
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin
Michael W. Mahoney
AI4CE
47
192
0
02 Oct 2018
A theoretical framework for deep locally connected ReLU network
Yuandong Tian
PINN
25
10
0
28 Sep 2018
On the loss landscape of a class of deep neural networks with no bad local valleys
Quynh N. Nguyen
Mahesh Chandra Mukkamala
Matthias Hein
16
87
0
27 Sep 2018
A Deep Learning Framework for Unsupervised Affine and Deformable Image Registration
B. D. de Vos
F. Berendsen
M. Viergever
Hessam Sokooti
Marius Staring
Ivana Isgum
MedIm
25
675
0
17 Sep 2018
Hubless keypoint-based 3D deformable groupwise registration
Rémi Agier
S. Valette
R. Kéchichian
L. Fanton
R. Prost
13
10
0
11 Sep 2018
Troubling Trends in Machine Learning Scholarship
Zachary Chase Lipton
Jacob Steinhardt
29
288
0
09 Jul 2018
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Cong Fang
C. J. Li
Zhouchen Lin
Tong Zhang
50
570
0
04 Jul 2018
Algorithms for solving optimization problems arising from deep neural net models: smooth problems
Vyacheslav Kungurtsev
Tomás Pevný
26
6
0
30 Jun 2018
PCA of high dimensional random walks with comparison to neural network training
J. Antognini
Jascha Narain Sohl-Dickstein
OOD
24
27
0
22 Jun 2018
Stochastic Nested Variance Reduction for Nonconvex Optimization
Dongruo Zhou
Pan Xu
Quanquan Gu
25
146
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
51
3,117
0
20 Jun 2018
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida
S. Akaho
S. Amari
FedML
47
140
0
04 Jun 2018
Understanding Generalization and Optimization Performance of Deep CNNs
Pan Zhou
Jiashi Feng
MLT
19
48
0
28 May 2018
Entropy and mutual information in models of deep neural networks
Marylou Gabrié
Andre Manoel
Clément Luneau
Jean Barbier
N. Macris
Florent Krzakala
Lenka Zdeborová
30
179
0
24 May 2018
Universal discriminative quantum neural networks
Hongxiang Chen
Leonard Wossnig
Simone Severini
Hartmut Neven
Masoud Mohseni
19
80
0
22 May 2018
Local Saddle Point Optimization: A Curvature Exploitation Approach
Leonard Adolphs
Hadi Daneshmand
Aurelien Lucchi
Thomas Hofmann
37
107
0
15 May 2018
Measuring the Intrinsic Dimension of Objective Landscapes
Chunyuan Li
Heerad Farkhoor
Rosanne Liu
J. Yosinski
26
398
0
24 Apr 2018
On Gradient-Based Learning in Continuous Games
Eric Mazumdar
Lillian J. Ratliff
S. Shankar Sastry
22
134
0
16 Apr 2018
The Loss Surface of XOR Artificial Neural Networks
D. Mehta
Xiaojun Zhao
Edgar A. Bernal
D. Wales
34
19
0
06 Apr 2018
DeepSigns: A Generic Watermarking Framework for IP Protection of Deep Learning Models
B. Rouhani
Huili Chen
F. Koushanfar
40
48
0
02 Apr 2018
A Survey on Deep Learning Methods for Robot Vision
Javier Ruiz-del-Solar
P. Loncomilla
Naiomi Soto
31
60
0
28 Mar 2018
Comparing Dynamics: Deep Neural Networks versus Glassy Systems
Marco Baity-Jesi
Levent Sagun
Mario Geiger
S. Spigler
Gerard Ben Arous
C. Cammarota
Yann LeCun
M. Wyart
Giulio Biroli
AI4CE
42
113
0
19 Mar 2018
Replica Symmetry Breaking in Bipartite Spin Glasses and Neural Networks
Gavin Hartnett
Edward Parker
Edward Geist
13
23
0
17 Mar 2018
Escaping Saddles with Stochastic Gradients
Hadi Daneshmand
Jonas Köhler
Aurelien Lucchi
Thomas Hofmann
24
162
0
15 Mar 2018
Essentially No Barriers in Neural Network Energy Landscape
Felix Dräxler
K. Veschgini
M. Salmhofer
Fred Hamprecht
MoMe
22
424
0
02 Mar 2018
Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
T. Garipov
Pavel Izmailov
Dmitrii Podoprikhin
Dmitry Vetrov
A. Wilson
UQCV
25
734
0
27 Feb 2018
A Walk with SGD
Chen Xing
Devansh Arpit
Christos Tsirigotis
Yoshua Bengio
27
118
0
24 Feb 2018
Understanding the Loss Surface of Neural Networks for Binary Classification
Shiyu Liang
Ruoyu Sun
Yixuan Li
R. Srikant
35
87
0
19 Feb 2018