ResearchTrend.AI

arXiv:1802.06175
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg, Yuanzhi Li, Yang Yuan
17 February 2018 · MLT
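The paper argues that SGD's inherent gradient noise lets it escape sharp local minima while settling into wide ones, because the noisy iterates effectively optimize a smoothed version of the loss. A minimal one-dimensional sketch of that effect (a hypothetical toy landscape chosen for illustration, not the paper's construction): a wide quadratic valley at x = 2 with a narrow, sharp dip near x = 0.

```python
import math
import random

# Hedged illustration, not the paper's construction: full-batch gradient
# descent gets trapped in the narrow sharp dip near x = 0, while SGD-style
# gradient noise carries the iterate over the shallow barrier into the wide
# valley at x = 2, matching the intuition that sharp minima are unstable
# under noise.

W = 0.05  # width of the sharp dip (constants chosen for the demo)
D = 0.5   # depth of the sharp dip

def grad(x):
    """Gradient of f(x) = (x - 2)^2 - D * exp(-(x / W)^2)."""
    return 2.0 * (x - 2.0) + D * (2.0 * x / W ** 2) * math.exp(-(x / W) ** 2)

def run(steps, lr, noise_std, seed=0):
    rng = random.Random(seed)
    x = 0.0  # start inside the sharp basin
    trace = []
    for _ in range(steps):
        g = grad(x) + noise_std * rng.gauss(0.0, 1.0)  # noisy gradient oracle
        x -= lr * g
        trace.append(x)
    return trace

gd = run(steps=5000, lr=1e-3, noise_std=0.0)    # full-batch gradient descent
sgd = run(steps=5000, lr=1e-3, noise_std=20.0)  # SGD-like noisy gradients

print("GD final x:", round(gd[-1], 3))          # stays in the sharp dip near 0
print("SGD mean of last 1000 x:",
      round(sum(sgd[-1000:]) / 1000, 3))        # settles in the wide valley near 2
```

With zero noise the iterate converges to the sharp dip's local minimum (around x ≈ 0.01) and stays there; with noise it crosses the barrier within a few dozen steps and fluctuates around the wide minimum at x = 2.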

Papers citing "An Alternative View: When Does SGD Escape Local Minima?"

Showing 19 of 69 citing papers.
FastGAE: Scalable Graph Autoencoders with Stochastic Subgraph Decoding
Guillaume Salha-Galvan, Romain Hennequin, Jean-Baptiste Remy, Manuel Moussallam, Michalis Vazirgiannis
05 Feb 2020 · GNN, BDL

A frequency-domain analysis of inexact gradient methods
Oran Gannot
31 Dec 2019

Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang
26 Dec 2019 · ODL

Stochastic gradient descent for hybrid quantum-classical optimization
R. Sweke, Frederik Wilde, Johannes Jakob Meyer, Maria Schuld, Paul K. Fährmann, Barthélémy Meynard-Piganeau, Jens Eisert
02 Oct 2019

Stochastic AUC Maximization with Deep Neural Networks
Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang
28 Aug 2019

How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan
05 Aug 2019

On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu
18 Jun 2019 · MLT

Langevin Monte Carlo without smoothness
Niladri S. Chatterji, Jelena Diakonikolas, Michael I. Jordan, Peter L. Bartlett
30 May 2019 · BDL

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien
24 May 2019 · ODL

Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Haowei He, Gao Huang, Yang Yuan
02 Feb 2019 · ODL, MLT

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
29 Jan 2019 · ODL

SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh
02 Jan 2019

An Empirical Study of Example Forgetting during Deep Neural Network Learning
Mariya Toneva, Alessandro Sordoni, Rémi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon
12 Dec 2018

Stagewise Training Accelerates Convergence of Testing Error Over SGD
Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang
10 Dec 2018

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
12 Nov 2018 · MLT

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark Schmidt
16 Oct 2018

On the Learning Dynamics of Deep Neural Networks
Rémi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio
18 Sep 2018

On the Local Minima of the Empirical Risk
Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan
25 Mar 2018 · FedML

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016 · ODL