Gradient Methods with Online Scaling Part I. Theoretical Foundations

Main: 19 pages · 4 figures · 5 tables · Bibliography: 4 pages · Appendix: 17 pages
Abstract

This paper establishes the theoretical foundations of online scaled gradient methods (OSGM), a framework that uses online learning to adapt stepsizes and provably accelerate first-order methods. OSGM quantifies the effectiveness of a stepsize by a feedback function motivated by a convergence measure and uses this feedback to adjust the stepsize through an online learning algorithm. Consequently, instantiations of OSGM achieve convergence rates that are asymptotically no worse than those of the optimal stepsize. OSGM yields desirable convergence guarantees on smooth convex problems, including 1) trajectory-dependent global convergence on smooth convex objectives; 2) an improved complexity result on smooth strongly convex problems; and 3) local superlinear convergence. Notably, OSGM constitutes a new family of first-order methods with non-asymptotic superlinear convergence, joining the celebrated quasi-Newton methods. Finally, OSGM explains the empirical success of the popular hypergradient-descent heuristic in optimization for machine learning.
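As an illustration of the idea of adapting the stepsize from gradient feedback, the following sketch implements the classical hypergradient-descent heuristic that the abstract says OSGM explains (this is the heuristic, not the paper's OSGM algorithm itself; the quadratic test problem, the function names, and the constants `eta0` and `beta` are assumptions for illustration):

```python
import numpy as np

def hypergradient_descent(grad, x0, eta0=0.05, beta=1e-3, iters=200):
    """Gradient descent whose scalar stepsize eta is itself updated by
    gradient descent on the objective with respect to eta. The partial
    derivative of f(x_k) with respect to eta reduces to -g_k . g_{k-1},
    so the stepsize grows while successive gradients stay correlated
    and shrinks when the iterates start to oscillate."""
    x, eta = x0.astype(float), eta0
    g_prev = np.zeros_like(x)
    for _ in range(iters):
        g = grad(x)
        eta += beta * g.dot(g_prev)  # hypergradient step on the stepsize
        x -= eta * g                 # ordinary gradient step on x
        g_prev = g
    return x, eta

# Toy problem (assumed for illustration): f(x) = 0.5 x^T diag(a) x,
# whose gradient is a * x and whose minimizer is the origin.
a = np.array([1.0, 2.0, 4.0])
x_star, eta_final = hypergradient_descent(lambda x: a * x, np.ones(3))
```

On this toy quadratic the adapted stepsize increases from its initial value while the gradients remain positively correlated, and the iterates converge to the minimizer; OSGM replaces this heuristic update with a principled online learning algorithm driven by a feedback function.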

@article{gao2025_2505.23081,
  title={Gradient Methods with Online Scaling Part I. Theoretical Foundations},
  author={Wenzhi Gao and Ya-Chi Chu and Yinyu Ye and Madeleine Udell},
  journal={arXiv preprint arXiv:2505.23081},
  year={2025}
}