Gradient Descent Can Take Exponential Time to Escape Saddle Points

29 May 2017
S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabás Póczós, Aarti Singh

Papers citing "Gradient Descent Can Take Exponential Time to Escape Saddle Points"

48 papers
Nesterov acceleration in benignly non-convex landscapes
Kanan Gupta, Stephan Wojtowytsch · 10 Oct 2024

Mask in the Mirror: Implicit Sparsification
Tom Jacobs, R. Burkholz · 19 Aug 2024

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon S. Du · 03 Oct 2023

Memory-Query Tradeoffs for Randomized Convex Optimization
Xinyu Chen, Binghui Peng · 21 Jun 2023

Almost Sure Saddle Avoidance of Stochastic Gradient Methods without the Bounded Gradient Assumption
Jun Liu, Ye Yuan · ODL · 15 Feb 2023

On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin · 03 Feb 2023

Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization
Hoki Kim, Jinseong Park, Yujin Choi, Woojin Lee, Jaewook Lee · 27 Jan 2023

Stability Analysis of Sharpness-Aware Minimization
Hoki Kim, Jinseong Park, Yujin Choi, Jaewook Lee · 16 Jan 2023

Decentralized Nonconvex Optimization with Guaranteed Privacy and Accuracy
Yongqiang Wang, Tamer Basar · 14 Dec 2022

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal, Param Budhraja, V. Raj, A. Hota · 07 Dec 2022

Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points
R. Tappenden, Martin Takáč · 02 Nov 2022

Stochastic noise can be helpful for variational quantum algorithms
Junyu Liu, Frederik Wilde, A. A. Mele, Liang Jiang, Jens Eisert · 13 Oct 2022

Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients
Hualin Zhang, Huan Xiong, Bin Gu · 04 Oct 2022

Nonconvex Matrix Factorization is Geodesically Convex: Global Landscape Analysis for Fixed-rank Matrix Optimization From a Riemannian Perspective
Yuetian Luo, Nicolas García Trillos · 29 Sep 2022

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu · 19 Sep 2022

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 03 Aug 2022

Gradient Descent, Stochastic Optimization, and Other Tales
Jun Lu · 02 May 2022

Randomly Initialized Alternating Least Squares: Fast Convergence for Matrix Sensing
Kiryung Lee, Dominik Stöger · 25 Apr 2022

Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete
Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, Simon Weber · OffRL · 04 Apr 2022

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima
Zixiang Chen, Dongruo Zhou, Quanquan Gu · 25 Oct 2021

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT AI4CE · 06 Oct 2021

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein · 29 Sep 2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Dominik Stöger, Mahdi Soltanolkotabi · ODL · 28 Jun 2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Tian-Chun Ye, S. Du · 27 Jun 2021

Escaping Saddle Points with Compressed SGD
Dmitrii Avdiukhin, G. Yaroslavtsev · 21 May 2021

Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li, Yuting Wei, Yuejie Chi, Yuxin Chen · 22 Feb 2021

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization
Jun-Kun Wang, Jacob D. Abernethy · 04 Oct 2020

Distributed Gradient Flow: Nonsmoothness, Nonconvexity, and Saddle Point Evasion
Brian Swenson, Ryan W. Murray, H. Vincent Poor, S. Kar · 12 Aug 2020

On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
P. Mertikopoulos, Nadav Hallak, Ali Kavis, V. Cevher · 19 Jun 2020

First Order Methods take Exponential Time to Converge to Global Minimizers of Non-Convex Functions
Krishna Reddy Kesari, Jean Honorio · 28 Feb 2020

On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li · ODL · 09 Feb 2020

Shadowing Properties of Optimization Algorithms
Antonio Orvieto, Aurelien Lucchi · 12 Nov 2019

Second-Order Guarantees of Stochastic Gradient Descent in Non-Convex Optimization
Stefan Vlaski, Ali H. Sayed · ODL · 19 Aug 2019

Distributed Learning in Non-Convex Environments -- Part II: Polynomial Escape from Saddle-Points
Stefan Vlaski, Ali H. Sayed · 03 Jul 2019

Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu, Jian Li · 13 Jun 2019

Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
Rong Ge, Zhize Li, Weiyao Wang, Xiang Wang · 01 May 2019

A Deterministic Gradient-Based Approach to Avoid Saddle Points
L. Kreusser, Stanley J. Osher, Bao Wang · ODL · 21 Jan 2019

Sharp Restricted Isometry Bounds for the Inexistence of Spurious Local Minima in Nonconvex Matrix Recovery
Richard Y. Zhang, Somayeh Sojoudi, Javad Lavaei · 07 Jan 2019

Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · ODL · 09 Nov 2018

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, S. Du, Michael I. Jordan, Weijie J. Su · 21 Oct 2018

Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao, Bryon Aragam, Bingjing Zhang, Eric Xing · 17 Oct 2018

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · MLT ODL · 04 Oct 2018

A theoretical framework for deep locally connected ReLU network
Yuandong Tian · PINN · 28 Sep 2018

Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
P. Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, V. Chandrasekhar, Georgios Piliouras · 07 Jul 2018

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
Dong Yin, Yudong Chen, Kannan Ramchandran, Peter L. Bartlett · FedML · 14 Jun 2018

An Information-Theoretic View for Deep Learning
Jingwei Zhang, Tongliang Liu, Dacheng Tao · MLT FAtt · 24 Apr 2018

Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow
Xiao Zhang, S. Du, Quanquan Gu · 03 Mar 2018

Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
Srinadh Bhojanapalli, Nicolas Boumal, Prateek Jain, Praneeth Netrapalli · 01 Mar 2018