Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking

30 November 2023

Papers citing "Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking"

GrokAlign: Geometric Characterisation and Acceleration of Grokking
Thomas Walker, Ahmed Imtiaz Humayun, Randall Balestriero, Richard G. Baraniuk
14 Jun 2025

Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian, Zixiao Zhu, Hanzhang Zhou, Zijian Feng, Zepeng Zhai, K. Mao
[AAML, VLM]
04 Apr 2025

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
[AI4CE]
28 Feb 2025

Grokking at the Edge of Numerical Stability
Lucas Prieto, Melih Barsbey, Pedro A.M. Mediano, Tolga Birdal
08 Jan 2025

Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song, Zhuoyan Xu, Yiqiao Zhong
31 Dec 2024

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Misha Belkin
29 Jul 2024

A rationale from frequency perspective for grokking in training neural network
Zhangchen Zhou, Yaoyu Zhang, Z. Xu
24 May 2024

Towards Uncovering How Large Language Model Works: An Explainability Perspective
Haiyan Zhao, Fan Yang, Bo Shen, Himabindu Lakkaraju, Jundong Li
16 Feb 2024