ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2302.12022
  4. Cited By
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

8 February 2023
Maor Ivgi
Oliver Hinder
Y. Carmon
    ODL
ArXivPDFHTML

Papers citing "DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule"

39 / 39 papers shown
Title
A Hessian-informed hyperparameter optimization for differential learning rate
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
87
1
0
12 Jan 2025
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
Volkan Cevher
Mingyi Hong
MU
113
10
0
29 Oct 2024
Parameter-free Regret in High Probability with Heavy Tails
Parameter-free Regret in High Probability with Heavy Tails
Jiujia Zhang
Ashok Cutkosky
46
20
0
25 Oct 2022
Making SGD Parameter-Free
Making SGD Parameter-Free
Y. Carmon
Oliver Hinder
77
45
0
04 May 2022
Parameter-free Mirror Descent
Parameter-free Mirror Descent
Andrew Jacobsen
Ashok Cutkosky
59
34
0
26 Feb 2022
The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded
  Gradients and Affine Variance
The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded Gradients and Affine Variance
Matthew Faw
Isidoros Tziotis
Constantine Caramanis
Aryan Mokhtari
Sanjay Shakkottai
Rachel A. Ward
55
59
0
11 Feb 2022
A ConvNet for the 2020s
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
159
5,167
0
10 Jan 2022
Datasets: A Community Library for Natural Language Processing
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
210
610
0
07 Sep 2021
Large-Scale Methods for Distributionally Robust Optimization
Large-Scale Methods for Distributionally Robust Optimization
Daniel Levy
Y. Carmon
John C. Duchi
Aaron Sidford
73
217
0
12 Oct 2020
Array Programming with NumPy
Array Programming with NumPy
Charles R. Harris
K. Millman
S. Walt
R. Gommers
Pauli Virtanen
...
Tyler Reddy
Warren Weckesser
Hameer Abbasi
C. Gohlke
T. Oliphant
147
14,953
0
18 Jun 2020
Better Parameter-free Stochastic Optimization with ODE Updates for
  Coin-Betting
Better Parameter-free Stochastic Optimization with ODE Updates for Coin-Betting
K. Chen
John Langford
Francesco Orabona
31
21
0
12 Jun 2020
Lipschitz and Comparator-Norm Adaptivity in Online Learning
Lipschitz and Comparator-Norm Adaptivity in Online Learning
Zakaria Mhammedi
Wouter M. Koolen
61
56
0
27 Feb 2020
Online Learning with Imperfect Hints
Online Learning with Imperfect Hints
Aditya Bhaskara
Ashok Cutkosky
Ravi Kumar
Manish Purohit
101
58
0
11 Feb 2020
On the distance between two neural networks and the stability of
  learning
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
229
58
0
09 Feb 2020
A Large-scale Study of Representation Learning with the Visual Task
  Adaptation Benchmark
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
Xiaohua Zhai
J. Puigcerver
Alexander Kolesnikov
P. Ruyssen
C. Riquelme
...
Michael Tschannen
Marcin Michalski
Olivier Bousquet
Sylvain Gelly
N. Houlsby
SSL
79
439
0
01 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
612
24,431
0
26 Jul 2019
Unlabeled Data Improves Adversarial Robustness
Unlabeled Data Improves Adversarial Robustness
Y. Carmon
Aditi Raghunathan
Ludwig Schmidt
Percy Liang
John C. Duchi
121
752
0
31 May 2019
Painless Stochastic Gradient: Interpolation, Line-Search, and
  Convergence Rates
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani
Aaron Mishkin
I. Laradji
Mark Schmidt
Gauthier Gidel
Simon Lacoste-Julien
ODL
81
209
0
24 May 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language
  Understanding Systems
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
256
2,309
0
02 May 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
230
996
0
01 Apr 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou
Junjie Yang
Huishuai Zhang
Yingbin Liang
Vahid Tarokh
52
74
0
02 Jan 2019
Neural Network Acceptability Judgments
Neural Network Acceptability Judgments
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
230
1,407
0
31 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,154
0
20 Apr 2018
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Noam M. Shazeer
Mitchell Stern
ODL
76
1,047
0
11 Apr 2018
Shampoo: Preconditioned Stochastic Tensor Optimization
Shampoo: Preconditioned Stochastic Tensor Optimization
Vineet Gupta
Tomer Koren
Y. Singer
ODL
82
219
0
26 Feb 2018
Large Batch Training of Convolutional Networks
Large Batch Training of Convolutional Networks
Yang You
Igor Gitman
Boris Ginsburg
ODL
128
848
0
13 Aug 2017
A Unified Approach to Adaptive Regularization in Online and Stochastic
  Optimization
A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization
Vineet Gupta
Tomer Koren
Y. Singer
30
22
0
20 Jun 2017
A Broad-Coverage Challenge Corpus for Sentence Understanding through
  Inference
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
517
4,476
0
18 Apr 2017
Remote Sensing Image Scene Classification: Benchmark and State of the
  Art
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
101
2,255
0
01 Mar 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary
  Visual Reasoning
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
295
2,375
0
20 Dec 2016
Densely Connected Convolutional Networks
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
766
36,794
0
25 Aug 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
274
8,127
0
16 Jun 2016
Wide Residual Networks
Wide Residual Networks
Sergey Zagoruyko
N. Komodakis
334
7,984
0
23 May 2016
Going Deeper with Convolutions
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
457
43,649
0
17 Sep 2014
Describing Textures in the Wild
Describing Textures in the Wild
Mircea Cimpoi
Subhransu Maji
Iasonas Kokkinos
S. Mohamed
Andrea Vedaldi
3DV
116
2,671
0
14 Nov 2013
Stochastic Gradient Descent for Non-smooth Optimization: Convergence
  Results and Optimal Averaging Schemes
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
146
574
0
08 Dec 2012
No-Regret Algorithms for Unconstrained Online Convex Optimization
No-Regret Algorithms for Unconstrained Online Convex Optimization
Matthew J. Streeter
H. B. McMahan
ODL
78
89
0
09 Nov 2012
No More Pesky Learning Rates
No More Pesky Learning Rates
Tom Schaul
Sixin Zhang
Yann LeCun
137
478
0
06 Jun 2012
Information-theoretic lower bounds on the oracle complexity of
  stochastic convex optimization
Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization
Alekh Agarwal
Peter L. Bartlett
Pradeep Ravikumar
Martin J. Wainwright
190
250
0
03 Sep 2010
1