Grokking as the Transition from Lazy to Rich Training Dynamics

9 October 2023

Samuel Gershman

Cengiz Pehlevan

Papers citing "Grokking as the Transition from Lazy to Rich Training Dynamics"

Title | Authors | Date
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking) | Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee | 28 Feb 2025
Grokking Explained: A Statistical Phenomenon | B. W. Carvalho, Artur Garcez, Luís C. Lamb, Emílio Vital Brazil | 03 Feb 2025
Grokking at the Edge of Numerical Stability | Lucas Prieto, Melih Barsbey, Pedro A.M. Mediano, Tolga Birdal | 08 Jan 2025
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks | Ann Huang, Satpreet H. Singh, Flavio Martinelli, Kanaka Rajan | 04 Oct 2024
Learning time-scales in two-layers neural networks | Raphael Berthier, Andrea Montanari, Kangjie Zhou | 28 Feb 2023
Grokking modular arithmetic | Andrey Gromov | 06 Jan 2023
Learning Single-Index Models with Shallow Neural Networks | A. Bietti, Joan Bruna, Clayton Sanford, M. Song | 27 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data | Ziming Liu, Eric J. Michaud, Max Tegmark | 03 Oct 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit | Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang | 18 Jul 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation | Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang | 03 May 2022
The high-dimensional asymptotics of first order methods with random data | Michael Celentano, Chen Cheng, Andrea Montanari | 14 Dec 2021
Geometric compression of invariant manifolds in neural nets | J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart | 22 Jul 2020
On the training dynamics of deep networks with $L_2$ regularization | Aitor Lewkowycz, Guy Gur-Ari | 15 Jun 2020
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan | 07 Feb 2020
Limitations of Lazy Training of Two-layers Neural Networks | Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari | 21 Jun 2019
Algorithms for Learning Kernels Based on Centered Alignment | Corinna Cortes, M. Mohri, Afshin Rostamizadeh | 02 Mar 2012