ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Gradient Descent Converges to Minimizers
arXiv:1602.04915 · 16 February 2016
J. Lee, Max Simchowitz, Michael I. Jordan, Benjamin Recht

Papers citing "Gradient Descent Converges to Minimizers" (43 papers shown)
  • Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture — Yikun Hou, Suvrit Sra, A. Yurtsever (28 Jan 2025)
  • Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding — Zhengqing Wu, Berfin Simsek, Francois Ged [ODL] (08 Feb 2024)
  • Offline Policy Evaluation and Optimization under Confounding — Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari [OffRL] (29 Nov 2022)
  • Stochastic noise can be helpful for variational quantum algorithms — Junyu Liu, Frederik Wilde, A. A. Mele, Liang Jiang, Jens Eisert (13 Oct 2022)
  • CoShNet: A Hybrid Complex Valued Neural Network using Shearlets — Manny Ko, Ujjawal K. Panchal, Héctor Andrade-Loarca, Andres Mendez-Vazquez (14 Aug 2022)
  • Optimal Rate Adaption in Federated Learning with Compressed Communications — Laizhong Cui, Xiaoxin Su, Yipeng Zhou, Jiangchuan Liu [FedML] (13 Dec 2021)
  • A Survey on Fault-tolerance in Distributed Optimization and Machine Learning — Shuo Liu [AI4CE, OOD] (16 Jun 2021)
  • Learning explanations that are hard to vary — Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf [FAtt] (01 Sep 2020)
  • Learning from Sparse Demonstrations — Wanxin Jin, Todd D. Murphey, Dana Kulić, Neta Ezer, Shaoshuai Mou (05 Aug 2020)
  • On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them — Chen Liu, Mathieu Salzmann, Tao R. Lin, Ryota Tomioka, Sabine Süsstrunk [AAML] (15 Jun 2020)
  • Implicit Geometric Regularization for Learning Shapes — Amos Gropp, Lior Yariv, Niv Haim, Matan Atzmon, Y. Lipman [AI4CE] (24 Feb 2020)
  • Depth Descent Synchronization in $\mathrm{SO}(D)$ — Tyler Maunu, Gilad Lerman [MDE] (13 Feb 2020)
  • On the Sample Complexity and Optimization Landscape for Quadratic Feasibility Problems — Parth Thaker, Gautam Dasarathy, Angelia Nedić (04 Feb 2020)
  • Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity — Shiyu Liang, Ruoyu Sun, R. Srikant (31 Dec 2019)
  • Shadowing Properties of Optimization Algorithms — Antonio Orvieto, Aurelien Lucchi (12 Nov 2019)
  • Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape — Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner (05 Jul 2019)
  • Combining Stochastic Adaptive Cubic Regularization with Negative Curvature for Nonconvex Optimization — Seonho Park, Seung Hyun Jung, P. Pardalos [ODL] (27 Jun 2019)
  • Differentiable Game Mechanics — Alistair Letcher, David Balduzzi, S. Racanière, James Martens, Jakob N. Foerster, K. Tuyls, T. Graepel (13 May 2019)
  • A Deterministic Gradient-Based Approach to Avoid Saddle Points — L. Kreusser, Stanley J. Osher, Bao Wang [ODL] (21 Jan 2019)
  • Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks — Mouloud Belbahri, Eyyub Sari, Sajad Darabi, V. Nia [MQ] (18 Jan 2019)
  • Gradient descent aligns the layers of deep linear networks — Ziwei Ji, Matus Telgarsky (04 Oct 2018)
  • On the Implicit Bias of Dropout — Poorya Mianjy, R. Arora, René Vidal (26 Jun 2018)
  • Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning — Dong Yin, Yudong Chen, Kannan Ramchandran, Peter L. Bartlett [FedML] (14 Jun 2018)
  • Local Saddle Point Optimization: A Curvature Exploitation Approach — Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann (15 May 2018)
  • Comparing Dynamics: Deep Neural Networks versus Glassy Systems — Marco Baity-Jesi, Levent Sagun, Mario Geiger, S. Spigler, Gerard Ben Arous, C. Cammarota, Yann LeCun, M. Wyart, Giulio Biroli [AI4CE] (19 Mar 2018)
  • Escaping Saddles with Stochastic Gradients — Hadi Daneshmand, Jonas Köhler, Aurelien Lucchi, Thomas Hofmann (15 Mar 2018)
  • The Mechanics of n-Player Differentiable Games — David Balduzzi, S. Racanière, James Martens, Jakob N. Foerster, K. Tuyls, T. Graepel [MLT] (15 Feb 2018)
  • Improving Generalization Performance by Switching from Adam to SGD — N. Keskar, R. Socher [ODL] (20 Dec 2017)
  • Theoretical insights into the optimization landscape of over-parameterized shallow neural networks — Mahdi Soltanolkotabi, Adel Javanmard, J. Lee (16 Jul 2017)
  • Fast Rates for Empirical Risk Minimization of Strict Saddle Problems — Alon Gonen, Shai Shalev-Shwartz (16 Jan 2017)
  • Symmetry, Saddle Points, and Global Optimization Landscape of Nonconvex Matrix Factorization — Xingguo Li, Junwei Lu, R. Arora, Jarvis Haupt, Han Liu, Zhaoran Wang, T. Zhao (29 Dec 2016)
  • Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond — Levent Sagun, Léon Bottou, Yann LeCun [UQCV] (22 Nov 2016)
  • Topology and Geometry of Half-Rectified Network Optimization — C. Freeman, Joan Bruna (04 Nov 2016)
  • Asynchronous Stochastic Gradient Descent with Delay Compensation — Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhiming Ma, Tie-Yan Liu (27 Sep 2016)
  • Stochastic Heavy Ball — S. Gadat, Fabien Panloup, Sofiane Saadane (14 Sep 2016)
  • Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach — Dohyung Park, Anastasios Kyrillidis, C. Caramanis, Sujay Sanghavi (12 Sep 2016)
  • Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences — Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael I. Jordan (04 Sep 2016)
  • Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent — Chi Jin, Sham Kakade, Praneeth Netrapalli (26 May 2016)
  • No bad local minima: Data independent training error guarantees for multilayer neural networks — Daniel Soudry, Y. Carmon (26 May 2016)
  • Matrix Completion has No Spurious Local Minimum — Rong Ge, J. Lee, Tengyu Ma (24 May 2016)
  • On the Powerball Method for Optimization — Ye Yuan, Mu Li, Jun Liu, Claire Tomlin (24 Mar 2016)
  • When Are Nonconvex Problems Not Scary? — Ju Sun, Qing Qu, John N. Wright (21 Oct 2015)
  • The Loss Surfaces of Multilayer Networks — A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun [ODL] (30 Nov 2014)