Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.08590
Cited By
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
14 June 2023
Nikhil Vyas
Depen Morwani
Rosie Zhao
Gal Kaplun
Sham Kakade
Boaz Barak
MLT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning"
10 / 10 papers shown
Title
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
44
4
0
10 Jan 2025
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
Cengiz Pehlevan
55
2
0
06 Oct 2024
How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
59
12
0
26 Sep 2024
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
54
1
0
22 Oct 2023
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Fuzhao Xue
Yao Fu
Wangchunshu Zhou
Zangwei Zheng
Yang You
88
79
0
22 May 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
252
474
0
24 Sep 2022
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
89
72
0
29 Sep 2021
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
Stanislaw Jastrzebski
Devansh Arpit
Oliver Åstrand
Giancarlo Kerg
Huan Wang
Caiming Xiong
R. Socher
Kyunghyun Cho
Krzysztof J. Geras
AI4CE
184
66
0
28 Dec 2020
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
159
236
0
04 Mar 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
310
2,896
0
15 Sep 2016
1