Glocal Smoothness: Line Search can really help!
Curtis Fox, Aaron Mishkin, Sharan Vaswani, Mark Schmidt
arXiv:2506.12648 (14 June 2025)
Papers citing "Glocal Smoothness: Line Search can really help!" (31 of 31 papers shown)
1. Gradient Methods with Online Scaling. Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell. 04 Nov 2024. Citations: 4.
2. Adaptive Backtracking Line Search. Joao V. Cavalcanti, Laurent Lessard, Ashia C. Wilson. 23 Aug 2024. Citations: 0.
3. Empirical Tests of Optimization Assumptions in Deep Learning. Hoang Tran, Qinzi Zhang, Ashok Cutkosky. 01 Jul 2024. Citations: 2.
4. Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer. Betty Shea, Mark Schmidt. 25 Jun 2024. Citations: 3.
5. Optimization on a Finer Scale: Bounded Local Subgradient Variation Perspective. Jelena Diakonikolas, Cristóbal Guzmán. 24 Mar 2024. Citations: 2.
6. Directional Smoothness and Gradient Methods: Convergence and Adaptivity. Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower. 06 Mar 2024. Citations: 9.
7. Non-Uniform Smoothness for Gradient Descent. A. Berahas, Lindon Roberts, Fred Roosta. 15 Nov 2023. Citations: 4.
8. A simple uniformly optimal method without line search for convex optimization. Tianjiao Li, Guanghui Lan. 16 Oct 2023. Citations: 22.
9. Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction. Xiao-Yan Jiang, Sebastian U. Stich. 11 Aug 2023. Citations: 19.
10. Normalized Gradients for All. Francesco Orabona. 10 Aug 2023. Citations: 10.
11. Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models. Leonardo Galli, Holger Rauhut, Mark Schmidt. 22 Jun 2023. Citations: 15.
12. Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking. Frederik Kunstner, V. S. Portella, Mark Schmidt, Nick Harvey. 05 Jun 2023. Citations: 9.
13. Accelerated first-order methods for convex optimization with locally Lipschitz continuous gradient. Zhaosong Lu, Sanyou Mei. 02 Jun 2022. Citations: 7.
14. Special Properties of Gradient Descent with Large Learning Rates. Amirkeivan Mohtashami, Martin Jaggi, Sebastian U. Stich. 30 May 2022. Citations: 9.
15. Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent. Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad. 21 Oct 2021. Citations: 12.
16. Leveraging Non-uniformity in First-order Non-convex Optimization. Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans. 13 May 2021. Citations: 50.
17. Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability. Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar. 26 Feb 2021. Citations: 272.
18. AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods. Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč. 19 Feb 2021. Citations: 13.
19. Loss landscapes and optimization in over-parameterized non-linear systems and neural networks. Chaoyue Liu, Libin Zhu, M. Belkin. 29 Feb 2020. Citations: 262.
20. Why gradient clipping accelerates training: A theoretical justification for adaptivity. J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie. 28 May 2019. Citations: 464.
21. Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates. Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien. 24 May 2019. Citations: 209.
22. On Matching Pursuit and Coordinate Descent. Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi. 26 Mar 2018. Citations: 23.
23. The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning. Siyuan Ma, Raef Bassily, M. Belkin. 18 Dec 2017. Citations: 289.
24. Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity. Benjamin Grimmer. 12 Dec 2017. Citations: 41.
25. Understanding deep learning requires rethinking generalization. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. 10 Nov 2016. Citations: 4,629.
26. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition. Hamed Karimi, J. Nutini, Mark Schmidt. 16 Aug 2016. Citations: 1,220.
27. Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection. J. Nutini, Mark Schmidt, I. Laradji, M. Friedlander, H. Koepke. 01 Jun 2015. Citations: 223.
28. Semi-Stochastic Gradient Descent Methods. Jakub Konečný, Peter Richtárik. 05 Dec 2013. Citations: 238.
29. Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm. Deanna Needell, Nathan Srebro, Rachel A. Ward. 21 Oct 2013. Citations: 554.
30. Minimizing Finite Sums with the Stochastic Average Gradient. Mark Schmidt, Nicolas Le Roux, Francis R. Bach. 10 Sep 2013. Citations: 1,249.
31. Piecewise linear regularized solution paths. Saharon Rosset, Ji Zhu. 16 Aug 2007. Citations: 521.