Gradient Descent Can Take Exponential Time to Escape Saddle Points

29 May 2017
S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabás Póczós, Aarti Singh

Papers citing "Gradient Descent Can Take Exponential Time to Escape Saddle Points"

48 papers
Nesterov acceleration in benignly non-convex landscapes
Kanan Gupta, Stephan Wojtowytsch · 10 Oct 2024

Mask in the Mirror: Implicit Sparsification
Tom Jacobs, R. Burkholz · 19 Aug 2024

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon S. Du · 03 Oct 2023

Memory-Query Tradeoffs for Randomized Convex Optimization
Xinyu Chen, Binghui Peng · 21 Jun 2023

Almost Sure Saddle Avoidance of Stochastic Gradient Methods without the Bounded Gradient Assumption
Jun Liu, Ye Yuan · ODL · 15 Feb 2023

On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin · 03 Feb 2023

Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization
Hoki Kim, Jinseong Park, Yujin Choi, Woojin Lee, Jaewook Lee · 27 Jan 2023

Stability Analysis of Sharpness-Aware Minimization
Hoki Kim, Jinseong Park, Yujin Choi, Jaewook Lee · 16 Jan 2023

Decentralized Nonconvex Optimization with Guaranteed Privacy and Accuracy
Yongqiang Wang, Tamer Basar · 14 Dec 2022

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal, Param Budhraja, V. Raj, A. Hota · 07 Dec 2022

Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points
R. Tappenden, Martin Takáč · 02 Nov 2022

Stochastic noise can be helpful for variational quantum algorithms
Junyu Liu, Frederik Wilde, A. A. Mele, Liang Jiang, Jens Eisert · 13 Oct 2022

Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients
Hualin Zhang, Huan Xiong, Bin Gu · 04 Oct 2022

Nonconvex Matrix Factorization is Geodesically Convex: Global Landscape Analysis for Fixed-rank Matrix Optimization From a Riemannian Perspective
Yuetian Luo, Nicolas García Trillos · 29 Sep 2022

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu · 19 Sep 2022

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 03 Aug 2022

Gradient Descent, Stochastic Optimization, and Other Tales
Jun Lu · 02 May 2022

Randomly Initialized Alternating Least Squares: Fast Convergence for Matrix Sensing
Kiryung Lee, Dominik Stöger · 25 Apr 2022

Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete
Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, Simon Weber · OffRL · 04 Apr 2022

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima
Zixiang Chen, Dongruo Zhou, Quanquan Gu · 25 Oct 2021

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT AI4CE · 06 Oct 2021

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein · 29 Sep 2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Dominik Stöger, Mahdi Soltanolkotabi · ODL · 28 Jun 2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Tian-Chun Ye, S. Du · 27 Jun 2021

Escaping Saddle Points with Compressed SGD
Dmitrii Avdiukhin, G. Yaroslavtsev · 21 May 2021

Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li, Yuting Wei, Yuejie Chi, Yuxin Chen · 22 Feb 2021

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization
Jun-Kun Wang, Jacob D. Abernethy · 04 Oct 2020

Distributed Gradient Flow: Nonsmoothness, Nonconvexity, and Saddle Point Evasion
Brian Swenson, Ryan W. Murray, H. Vincent Poor, S. Kar · 12 Aug 2020

On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
P. Mertikopoulos, Nadav Hallak, Ali Kavis, V. Cevher · 19 Jun 2020

First Order Methods take Exponential Time to Converge to Global Minimizers of Non-Convex Functions
Krishna Reddy Kesari, Jean Honorio · 28 Feb 2020

On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li · ODL · 09 Feb 2020

Shadowing Properties of Optimization Algorithms
Antonio Orvieto, Aurelien Lucchi · 12 Nov 2019

Second-Order Guarantees of Stochastic Gradient Descent in Non-Convex Optimization
Stefan Vlaski, Ali H. Sayed · ODL · 19 Aug 2019

Distributed Learning in Non-Convex Environments -- Part II: Polynomial Escape from Saddle-Points
Stefan Vlaski, Ali H. Sayed · 03 Jul 2019

Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu, Jian Li · 13 Jun 2019

Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
Rong Ge, Zhize Li, Weiyao Wang, Xiang Wang · 01 May 2019

A Deterministic Gradient-Based Approach to Avoid Saddle Points
L. Kreusser, Stanley J. Osher, Bao Wang · ODL · 21 Jan 2019

Sharp Restricted Isometry Bounds for the Inexistence of Spurious Local Minima in Nonconvex Matrix Recovery
Richard Y. Zhang, Somayeh Sojoudi, Javad Lavaei · 07 Jan 2019

Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · ODL · 09 Nov 2018

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, S. Du, Michael I. Jordan, Weijie J. Su · 21 Oct 2018

Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao, Bryon Aragam, Bingjing Zhang, Eric Xing · 17 Oct 2018

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · MLT ODL · 04 Oct 2018

A theoretical framework for deep locally connected ReLU network
Yuandong Tian · PINN · 28 Sep 2018

Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
P. Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, V. Chandrasekhar, Georgios Piliouras · 07 Jul 2018

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
Dong Yin, Yudong Chen, Kannan Ramchandran, Peter L. Bartlett · FedML · 14 Jun 2018

An Information-Theoretic View for Deep Learning
Jingwei Zhang, Tongliang Liu, Dacheng Tao · MLT FAtt · 24 Apr 2018

Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow
Xiao Zhang, S. Du, Quanquan Gu · 03 Mar 2018

Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
Srinadh Bhojanapalli, Nicolas Boumal, Prateek Jain, Praneeth Netrapalli · 01 Mar 2018