ResearchTrend.AI

arXiv:1802.06175
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg, Yuanzhi Li, Yang Yuan
17 February 2018 · MLT
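The paper argues that SGD's inherent gradient noise lets it escape sharp local minima while settling into wide ones, because the noisy iterates effectively optimize a smoothed version of the loss. A minimal one-dimensional sketch of that effect (a hypothetical toy landscape chosen for illustration, not the paper's construction): a wide quadratic valley at x = 2 with a narrow, sharp dip near x = 0.

```python
import math
import random

# Hedged illustration, not the paper's construction: full-batch gradient
# descent gets trapped in the narrow sharp dip near x = 0, while SGD-style
# gradient noise carries the iterate over the shallow barrier into the wide
# valley at x = 2, matching the intuition that sharp minima are unstable
# under noise.

W = 0.05  # width of the sharp dip (constants chosen for the demo)
D = 0.5   # depth of the sharp dip

def grad(x):
    """Gradient of f(x) = (x - 2)^2 - D * exp(-(x / W)^2)."""
    return 2.0 * (x - 2.0) + D * (2.0 * x / W ** 2) * math.exp(-(x / W) ** 2)

def run(steps, lr, noise_std, seed=0):
    rng = random.Random(seed)
    x = 0.0  # start inside the sharp basin
    trace = []
    for _ in range(steps):
        g = grad(x) + noise_std * rng.gauss(0.0, 1.0)  # noisy gradient oracle
        x -= lr * g
        trace.append(x)
    return trace

gd = run(steps=5000, lr=1e-3, noise_std=0.0)    # full-batch gradient descent
sgd = run(steps=5000, lr=1e-3, noise_std=20.0)  # SGD-like noisy gradients

print("GD final x:", round(gd[-1], 3))          # stays in the sharp dip near 0
print("SGD mean of last 1000 x:",
      round(sum(sgd[-1000:]) / 1000, 3))        # settles in the wide valley near 2
```

With zero noise the iterate converges to the sharp dip's local minimum (around x ≈ 0.01) and stays there; with noise it crosses the barrier within a few dozen steps and fluctuates around the wide minimum at x = 2.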

Papers citing "An Alternative View: When Does SGD Escape Local Minima?"

Showing 19 of 69 citing papers.
FastGAE: Scalable Graph Autoencoders with Stochastic Subgraph Decoding
Guillaume Salha-Galvan, Romain Hennequin, Jean-Baptiste Remy, Manuel Moussallam, Michalis Vazirgiannis
05 Feb 2020 · GNN, BDL

A frequency-domain analysis of inexact gradient methods
Oran Gannot
31 Dec 2019

Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang
26 Dec 2019 · ODL

Stochastic gradient descent for hybrid quantum-classical optimization
R. Sweke, Frederik Wilde, Johannes Jakob Meyer, Maria Schuld, Paul K. Fährmann, Barthélémy Meynard-Piganeau, Jens Eisert
02 Oct 2019

Stochastic AUC Maximization with Deep Neural Networks
Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang
28 Aug 2019

How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan
05 Aug 2019

On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu
18 Jun 2019 · MLT

Langevin Monte Carlo without smoothness
Niladri S. Chatterji, Jelena Diakonikolas, Michael I. Jordan, Peter L. Bartlett
30 May 2019 · BDL

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien
24 May 2019 · ODL

Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Haowei He, Gao Huang, Yang Yuan
02 Feb 2019 · ODL, MLT

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani, Shankar Krishnan, Ying Xiao
29 Jan 2019 · ODL

SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh
02 Jan 2019

An Empirical Study of Example Forgetting during Deep Neural Network Learning
Mariya Toneva, Alessandro Sordoni, Rémi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon
12 Dec 2018

Stagewise Training Accelerates Convergence of Testing Error Over SGD
Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang
10 Dec 2018

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang
12 Nov 2018 · MLT

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark Schmidt
16 Oct 2018

On the Learning Dynamics of Deep Neural Networks
Rémi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio
18 Sep 2018

On the Local Minima of the Empirical Risk
Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan
25 Mar 2018 · FedML

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016 · ODL