ResearchTrend.AI

Glocal Smoothness: Line Search can really help!

14 June 2025
Curtis Fox, Aaron Mishkin, Sharan Vaswani, Mark Schmidt
arXiv: 2506.12648

Papers citing "Glocal Smoothness: Line Search can really help!"

31 papers shown:
  1. Gradient Methods with Online Scaling. Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell. 04 Nov 2024.
  2. Adaptive Backtracking Line Search. Joao V. Cavalcanti, Laurent Lessard, Ashia C. Wilson. 23 Aug 2024.
  3. Empirical Tests of Optimization Assumptions in Deep Learning. Hoang Tran, Qinzi Zhang, Ashok Cutkosky. 01 Jul 2024.
  4. Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer. Betty Shea, Mark Schmidt. 25 Jun 2024. [ODL]
  5. Optimization on a Finer Scale: Bounded Local Subgradient Variation Perspective. Jelena Diakonikolas, Cristóbal Guzmán. 24 Mar 2024.
  6. Directional Smoothness and Gradient Methods: Convergence and Adaptivity. Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower. 06 Mar 2024.
  7. Non-Uniform Smoothness for Gradient Descent. A. Berahas, Lindon Roberts, Fred Roosta. 15 Nov 2023.
  8. A simple uniformly optimal method without line search for convex optimization. Tianjiao Li, Guanghui Lan. 16 Oct 2023.
  9. Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction. Xiao-Yan Jiang, Sebastian U. Stich. 11 Aug 2023.
  10. Normalized Gradients for All. Francesco Orabona. 10 Aug 2023.
  11. Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models. Leonardo Galli, Holger Rauhut, Mark Schmidt. 22 Jun 2023.
  12. Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking. Frederik Kunstner, V. S. Portella, Mark Schmidt, Nick Harvey. 05 Jun 2023.
  13. Accelerated first-order methods for convex optimization with locally Lipschitz continuous gradient. Zhaosong Lu, Sanyou Mei. 02 Jun 2022.
  14. Special Properties of Gradient Descent with Large Learning Rates. Amirkeivan Mohtashami, Martin Jaggi, Sebastian U. Stich. 30 May 2022. [MLT]
  15. Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent. Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad. 21 Oct 2021.
  16. Leveraging Non-uniformity in First-order Non-convex Optimization. Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvári, Dale Schuurmans. 13 May 2021.
  17. Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability. Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar. 26 Feb 2021. [ODL]
  18. AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods. Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč. 19 Feb 2021. [ODL]
  19. Loss landscapes and optimization in over-parameterized non-linear systems and neural networks. Chaoyue Liu, Libin Zhu, M. Belkin. 29 Feb 2020. [ODL]
  20. Why gradient clipping accelerates training: A theoretical justification for adaptivity. J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie. 28 May 2019.
  21. Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates. Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien. 24 May 2019. [ODL]
  22. On Matching Pursuit and Coordinate Descent. Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi. 26 Mar 2018.
  23. The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning. Siyuan Ma, Raef Bassily, M. Belkin. 18 Dec 2017.
  24. Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity. Benjamin Grimmer. 12 Dec 2017.
  25. Understanding deep learning requires rethinking generalization. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. 10 Nov 2016. [HAI]
  26. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition. Hamed Karimi, J. Nutini, Mark Schmidt. 16 Aug 2016.
  27. Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection. J. Nutini, Mark Schmidt, I. Laradji, M. Friedlander, H. Koepke. 01 Jun 2015.
  28. Semi-Stochastic Gradient Descent Methods. Jakub Konečný, Peter Richtárik. 05 Dec 2013. [ODL]
  29. Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm. Deanna Needell, Nathan Srebro, Rachel A. Ward. 21 Oct 2013.
  30. Minimizing Finite Sums with the Stochastic Average Gradient. Mark Schmidt, Nicolas Le Roux, Francis R. Bach. 10 Sep 2013.
  31. Piecewise linear regularized solution paths. Saharon Rosset, Ji Zhu. 16 Aug 2007.