2206.04817
The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
10 June 2022
Vimal Thilak
Etai Littwin
Shuangfei Zhai
Omid Saremi
Roni Paiss
J. Susskind
Papers citing "The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon"
12 / 12 papers shown
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
201
0
0
29 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
41
0
0
24 Apr 2025
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
43
0
0
04 Apr 2025
Grokking at the Edge of Numerical Stability
Lucas Prieto
Melih Barsbey
Pedro A.M. Mediano
Tolga Birdal
51
3
0
08 Jan 2025
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding
Zifan Carl Guo
Eric J. Michaud
Ziming Liu
Max Tegmark
48
3
0
27 May 2024
Grokking as Compression: A Nonlinear Complexity Perspective
Ziming Liu
Ziqian Zhong
Max Tegmark
38
9
0
09 Oct 2023
Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data
Zhiwei Xu
Yutong Wang
Spencer Frei
Gal Vardi
Wei Hu
MLT
28
24
0
04 Oct 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
40
86
0
25 Sep 2023
Progress measures for grokking via mechanistic interpretability
Neel Nanda
Lawrence Chan
Tom Lieberum
Jess Smith
Jacob Steinhardt
49
386
0
12 Jan 2023
Grokking phase transitions in learning local rules with gradient descent
Bojan Žunkovič
E. Ilievski
63
17
0
26 Oct 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
83
91
0
19 May 2022
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
159
235
0
04 Mar 2020