Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.18817
Cited By
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
30 November 2023
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking"
7 / 7 papers shown
Title
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
40
0
0
04 Apr 2025
Grokking at the Edge of Numerical Stability
Lucas Prieto
Melih Barsbey
Pedro A.M. Mediano
Tolga Birdal
45
3
0
08 Jan 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
85
4
0
31 Dec 2024
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
50
3
0
16 Aug 2024
Grokking phase transitions in learning local rules with gradient descent
Bojan Žunkovič
E. Ilievski
63
16
0
26 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
56
76
0
03 Oct 2022
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
90
98
0
13 Oct 2021
1