1611.04831
The Power of Normalization: Faster Evasion of Saddle Points
15 November 2016
Kfir Y. Levy
Papers citing
"The Power of Normalization: Faster Evasion of Saddle Points"
50 / 69 papers shown
- Smoothed Normalization for Efficient Distributed Private Optimization (Egor Shulgin, Sarit Khirirat, Peter Richtárik; 20 Feb 2025) [FedML]
- From Gradient Clipping to Normalization for Heavy Tailed SGD (Florian Hübler, Ilyas Fatkhullin, Niao He; 17 Oct 2024)
- Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds (Daniel Dodd, Louis Sharrock, Christopher Nemeth; 04 Jun 2024)
- Series of Hessian-Vector Products for Tractable Saddle-Free Newton Optimisation of Neural Networks (E. T. Oldewage, Ross M. Clarke, José Miguel Hernández-Lobato; 23 Oct 2023) [ODL]
- Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts (Lizhang Chen, Bo Liu, Kaizhao Liang, Qian Liu; 09 Oct 2023) [ODL]
- Toward Understanding Why Adam Converges Faster Than SGD for Transformers (Yan Pan, Yuanzhi Li; 31 May 2023)
- Understanding Predictive Coding as an Adaptive Trust-Region Method (Francesco Innocenti, Ryan Singh, Christopher L. Buckley; 29 May 2023)
- Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods (Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He; 21 May 2023)
- Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees (Anastasia Koloskova, Hadrien Hendrikx, Sebastian U. Stich; 02 May 2023)
- EPISODE: Episodic Gradient Clipping with Periodic Resampled Corrections for Federated Learning with Heterogeneous Data (M. Crawshaw, Yajie Bao, Mingrui Liu; 14 Feb 2023) [FedML]
- An SDE for Modeling SAM: Theory and Insights (Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurelien Lucchi; 19 Jan 2023)
- Mitigating Memorization of Noisy Labels by Clipping the Model Prediction (Hongxin Wei, Huiping Zhuang, Renchunzi Xie, Lei Feng, Gang Niu, Bo An, Yixuan Li; 08 Dec 2022) [VLM, NoLa]
- Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points (Mayank Baranwal, Param Budhraja, V. Raj, A. Hota; 07 Dec 2022)
- Escaping From Saddle Points Using Asynchronous Coordinate Gradient Descent (Marco Bornstein, Jin-Peng Liu, Jingling Li, Furong Huang; 17 Nov 2022)
- Dissecting adaptive methods in GANs (Samy Jelassi, David Dobre, A. Mensch, Yuanzhi Li, Gauthier Gidel; 09 Oct 2022)
- A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks (Mingrui Liu, Zhenxun Zhuang, Yunwei Lei, Chunyang Liao; 10 May 2022)
- Robust Training of Neural Networks Using Scale Invariant Architectures (Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Surinder Kumar; 02 Feb 2022)
- On the Second-order Convergence Properties of Random Search Methods (Aurelien Lucchi, Antonio Orvieto, Adamos Solomou; 25 Oct 2021)
- Robust Distributed Optimization With Randomly Corrupted Gradients (Berkay Turan, César A. Uribe, Hoi-To Wai, M. Alizadeh; 28 Jun 2021)
- Backward Gradient Normalization in Deep Neural Networks (Alejandro Cabana, Luis F. Lago-Fernández; 17 Jun 2021) [ODL]
- Escaping Saddle Points Faster with Stochastic Momentum (Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy; 05 Jun 2021) [ODL]
- On the Differentially Private Nature of Perturbed Gradient Descent (Thulasi Tholeti, Sheetal Kalyani; 18 Jan 2021)
- Recent Theoretical Advances in Non-Convex Optimization (Marina Danilova, Pavel Dvurechensky, Alexander Gasnikov, Eduard A. Gorbunov, Sergey Guminov, Dmitry Kamzolov, Innokentiy Shibaev; 11 Dec 2020)
- A One-bit, Comparison-Based Gradient Estimator (HanQin Cai, Daniel McKenzie, W. Yin, Zhenliang Zhang; 06 Oct 2020)
- Improved Analysis of Clipping Algorithms for Non-convex Optimization (Bohang Zhang, Jikai Jin, Cong Fang, Liwei Wang; 05 Oct 2020)
- Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization (Jun-Kun Wang, Jacob D. Abernethy; 04 Oct 2020)
- Binary Search and First Order Gradient Based Method for Stochastic Optimization (V. Pandey; 27 Jul 2020) [ODL]
- Quantum algorithms for escaping from saddle points (Chenyi Zhang, Jiaqi Leng, Tongyang Li; 20 Jul 2020)
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning (Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney; 01 Jun 2020) [ODL]
- Online non-convex learning for river pollution source identification (Wenjie Huang, Jing Jiang, Xiao Liu; 22 May 2020)
- Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping (Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov; 21 May 2020)
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch (Esteban Real, Chen Liang, David R. So, Quoc V. Le; 06 Mar 2020)
- The Geometry of Sign Gradient Descent (Lukas Balles, Fabian Pedregosa, Nicolas Le Roux; 19 Feb 2020) [ODL]
- Why are Adaptive Methods Good for Attention Models? (J.N. Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Surinder Kumar, S. Sra; 06 Dec 2019)
- Stationary Points of Shallow Neural Networks with Quadratic Activation Function (D. Gamarnik, Eren C. Kizildag, Ilias Zadik; 03 Dec 2019)
- Shadowing Properties of Optimization Algorithms (Antonio Orvieto, Aurelien Lucchi; 12 Nov 2019)
- Efficiently avoiding saddle points with zero order methods: No gradients required (Lampros Flokas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Georgios Piliouras; 29 Oct 2019)
- Extending the step-size restriction for gradient descent to avoid strict saddle points (Hayden Schaeffer, S. McCalla; 05 Aug 2019)
- Efficiently escaping saddle points on manifolds (Christopher Criscitiello, Nicolas Boumal; 10 Jun 2019)
- Why gradient clipping accelerates training: A theoretical justification for adaptivity (J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie; 28 May 2019)
- On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics (Xi Chen, S. Du, Xin T. Tong; 30 Apr 2019)
- On Nonconvex Optimization for Machine Learning: Gradients, Stochasticity, and Saddle Points (Chi Jin, Praneeth Netrapalli, Rong Ge, Sham Kakade, Michael I. Jordan; 13 Feb 2019)
- Escaping Saddle Points with Adaptive Gradient Methods (Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, S. Sra; 26 Jan 2019) [ODL]
- A Deterministic Gradient-Based Approach to Avoid Saddle Points (L. Kreusser, Stanley J. Osher, Bao Wang; 21 Jan 2019) [ODL]
- Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview (Yuejie Chi, Yue M. Lu, Yuxin Chen; 25 Sep 2018)
- SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator (Cong Fang, C. J. Li, Zhouchen Lin, Tong Zhang; 04 Jul 2018)
- Finding Local Minima via Stochastic Nested Variance Reduction (Dongruo Zhou, Pan Xu, Quanquan Gu; 22 Jun 2018)
- Stochastic Nested Variance Reduction for Nonconvex Optimization (Dongruo Zhou, Pan Xu, Quanquan Gu; 20 Jun 2018)
- Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning (Dong Yin, Yudong Chen, Kannan Ramchandran, Peter L. Bartlett; 14 Jun 2018) [FedML]
- Adaptive Stochastic Gradient Langevin Dynamics: Taming Convergence and Saddle Point Escape Time (Hejian Sang, Jia-Wei Liu; 23 May 2018) [ODL]