The Power of Normalization: Faster Evasion of Saddle Points
Kfir Y. Levy
arXiv:1611.04831, 15 November 2016
Papers citing "The Power of Normalization: Faster Evasion of Saddle Points" (19 of 69 shown)
Escaping Saddles with Stochastic Gradients
Hadi Daneshmand, Jonas Köhler, Aurelien Lucchi, Thomas Hofmann (15 Mar 2018)

Convergence of Gradient Descent on Separable Data
Mor Shpigel Nacson, J. Lee, Suriya Gunasekar, Pedro H. P. Savarese, Nathan Srebro, Daniel Soudry (05 Mar 2018)

On the Power of Over-parametrization in Neural Networks with Quadratic Activation
S. Du, J. Lee (03 Mar 2018)

Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima
Yaodong Yu, Pan Xu, Quanquan Gu (18 Dec 2017)

Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently
Yaodong Yu, Difan Zou, Quanquan Gu (11 Dec 2017)

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
S. Du, J. Lee, Yuandong Tian, Barnabás Póczós, Aarti Singh (03 Dec 2017)

First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time
Yi Tian Xu, Rong Jin, Tianbao Yang (03 Nov 2017)

A Generic Approach for Escaping Saddle points
Sashank J. Reddi, Manzil Zaheer, S. Sra, Barnabás Póczós, Francis R. Bach, Ruslan Salakhutdinov, Alex Smola (05 Sep 2017)

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney (25 Aug 2017)

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney (23 Aug 2017)

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
Pan Xu, Jinghui Chen, Difan Zou, Quanquan Gu (20 Jul 2017)

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi, Adel Javanmard, J. Lee (16 Jul 2017)

Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network
Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, J. Carbonell (16 Jul 2017)

Sampling Matters in Deep Embedding Learning
Chaoxia Wu, R. Manmatha, Alex Smola, Philipp Krähenbühl (23 Jun 2017)

Online to Offline Conversions, Universality and Adaptive Minibatch Sizes
Kfir Y. Levy (30 May 2017)

Gradient Descent Can Take Exponential Time to Escape Saddle Points
S. Du, Chi Jin, J. Lee, Michael I. Jordan, Barnabás Póczós, Aarti Singh (29 May 2017)

How to Escape Saddle Points Efficiently
Chi Jin, Rong Ge, Praneeth Netrapalli, Sham Kakade, Michael I. Jordan (02 Mar 2017)

Fast Rates for Empirical Risk Minimization of Strict Saddle Problems
Alon Gonen, Shai Shalev-Shwartz (16 Jan 2017)

The Loss Surfaces of Multilayer Networks
A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun (30 Nov 2014)