Making SGD Parameter-Free

4 May 2022

Papers citing "Making SGD Parameter-Free"

33 / 33 papers shown

Title | Authors | Tag | Counts | Date
Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation | Robert M. Gower, Guillaume Garrigos, Nicolas Loizou, Dimitris Oikonomou, Konstantin Mishchenko, Fabian Schaipp | | 31 0 0 | 02 Apr 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization | Amit Attia, Tomer Koren | | 67 1 0 | 13 Mar 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices | Lixing Lyu, Jiashuo Jiang, Wang Chi Cheung | | 42 1 0 | 24 Feb 2025
Tuning-free coreset Markov chain Monte Carlo | Naitong Chen, Jonathan H. Huggins, Trevor Campbell | | 25 0 0 | 24 Oct 2024
State-free Reinforcement Learning | Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang | | 61 0 0 | 27 Sep 2024
Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds | Daniel Dodd, Louis Sharrock, Christopher Nemeth | | 36 0 0 | 04 Jun 2024
Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions | Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang | | 28 1 0 | 04 Jun 2024
Fully Unconstrained Online Learning | Ashok Cutkosky, Zakaria Mhammedi | CLL | 27 1 0 | 30 May 2024
Towards Stability of Parameter-free Optimization | Yijiang Pang, Shuyang Yu, Hoang Bao, Jiayu Zhou | | 29 1 0 | 07 May 2024
Directional Smoothness and Gradient Methods: Convergence and Adaptivity | Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert Mansel Gower | | 36 6 0 | 06 Mar 2024
The Price of Adaptivity in Stochastic Convex Optimization | Y. Carmon, Oliver Hinder | | 26 6 0 | 16 Feb 2024
Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization | Jiaxiang Li, Xuxing Chen, Shiqian Ma, Mingyi Hong | ODL | 19 2 0 | 13 Feb 2024
Tuning-Free Stochastic Optimization | Ahmed Khaled, Chi Jin | | 32 7 0 | 12 Feb 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size | P. Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov | ODL | 43 2 0 | 07 Feb 2024
How Free is Parameter-Free Stochastic Optimization? | Amit Attia, Tomer Koren | ODL | 44 4 0 | 05 Feb 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms | Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Mansel Gower, Martin Takáč | | 35 2 0 | 28 Dec 2023
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions | Y. Carmon, A. Jambulapati, Yujia Jin, Aaron Sidford | | 21 4 0 | 17 Nov 2023
Non-Uniform Smoothness for Gradient Descent | A. Berahas, Lindon Roberts, Fred Roosta | | 32 3 0 | 15 Nov 2023
A simple uniformly optimal method without line search for convex optimization | Tianjiao Li, Guanghui Lan | | 26 20 0 | 16 Oct 2023
Normalized Gradients for All | Francesco Orabona | | 20 8 0 | 10 Aug 2023
Prodigy: An Expeditiously Adaptive Parameter-Free Learner | Konstantin Mishchenko, Aaron Defazio | ODL | 28 55 0 | 09 Jun 2023
Mechanic: A Learning Rate Tuner | Ashok Cutkosky, Aaron Defazio, Harsh Mehta | OffRL | 19 15 0 | 31 May 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method | Ahmed Khaled, Konstantin Mishchenko, Chi Jin | ODL | 22 22 0 | 25 May 2023
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation | Zijian Liu, Zhengyuan Zhou | | 24 10 0 | 22 Mar 2023
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance | Amit Attia, Tomer Koren | ODL | 17 24 0 | 17 Feb 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule | Maor Ivgi, Oliver Hinder, Y. Carmon | ODL | 26 56 0 | 08 Feb 2023
Learning-Rate-Free Learning by D-Adaptation | Aaron Defazio, Konstantin Mishchenko | | 24 76 0 | 18 Jan 2023
Parameter-free Regret in High Probability with Heavy Tails | Jiujia Zhang, Ashok Cutkosky | | 14 20 0 | 25 Oct 2022
Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization | Junchi Yang, Xiang Li, Niao He | ODL | 27 22 0 | 01 Jun 2022
Learning to Accelerate by the Methods of Step-size Planning | Hengshuai Yao | | 21 0 0 | 01 Apr 2022
Robust Linear Regression for General Feature Distribution | Tom Norman, Nir Weinberger, Kfir Y. Levy | OOD | 17 2 0 | 04 Feb 2022
L4: Practical loss-based stepsize adaptation for deep learning | Michal Rolínek, Georg Martius | ODL | 36 63 0 | 14 Feb 2018
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes | Ohad Shamir, Tong Zhang | | 101 570 0 | 08 Dec 2012