arXiv: 2405.20233
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
30 May 2024
Jaerin Lee
Bong Gyun Kang
Kihoon Kim
Kyoung Mu Lee
Papers citing "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
10 / 10 papers shown
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
372
0
0
29 Apr 2025
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
87
0
0
04 Apr 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
114
0
0
28 Feb 2025
Grokking modular arithmetic
Andrey Gromov
69
39
0
06 Jan 2023
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
78
81
0
03 Oct 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
67
128
0
18 Jul 2022
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
292
42,038
0
03 Dec 2019
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
176
1,628
0
28 Dec 2018
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
289
10,412
0
21 Jul 2016
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
210
18,534
0
06 Feb 2015