arXiv:2310.06110
Grokking as the Transition from Lazy to Rich Training Dynamics
9 October 2023
Tanishq Kumar
Blake Bordelon
Samuel Gershman
Cengiz Pehlevan
Papers citing
"Grokking as the Transition from Lazy to Rich Training Dynamics"
16 / 16 papers shown
Title
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
143
1
0
28 Feb 2025
Grokking Explained: A Statistical Phenomenon
B. W. Carvalho
Artur Garcez
Luís C. Lamb
Emílio Vital Brazil
100
0
0
03 Feb 2025
Grokking at the Edge of Numerical Stability
Lucas Prieto
Melih Barsbey
Pedro A.M. Mediano
Tolga Birdal
111
3
0
08 Jan 2025
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
Ann Huang
Satpreet H. Singh
Flavio Martinelli
Kanaka Rajan
57
0
0
04 Oct 2024
Learning time-scales in two-layers neural networks
Raphael Berthier
Andrea Montanari
Kangjie Zhou
101
36
0
28 Feb 2023
Grokking modular arithmetic
Andrey Gromov
79
41
0
06 Jan 2023
Learning Single-Index Models with Shallow Neural Networks
A. Bietti
Joan Bruna
Clayton Sanford
M. Song
184
70
0
27 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
78
82
0
03 Oct 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
72
132
0
18 Jul 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba
Murat A. Erdogdu
Taiji Suzuki
Zhichao Wang
Denny Wu
Greg Yang
MLT
78
127
0
03 May 2022
The high-dimensional asymptotics of first order methods with random data
Michael Celentano
Chen Cheng
Andrea Montanari
AI4CE
30
38
0
14 Dec 2021
Geometric compression of invariant manifolds in neural nets
J. Paccolat
Leonardo Petrini
Mario Geiger
Kevin Tyloo
Matthieu Wyart
MLT
88
35
0
22 Jul 2020
On the training dynamics of deep networks with $L_2$ regularization
Aitor Lewkowycz
Guy Gur-Ari
82
53
0
15 Jun 2020
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Blake Bordelon
Abdulkadir Canatar
Cengiz Pehlevan
205
206
0
07 Feb 2020
Limitations of Lazy Training of Two-layers Neural Networks
Behrooz Ghorbani
Song Mei
Theodor Misiakiewicz
Andrea Montanari
MLT
55
143
0
21 Jun 2019
Algorithms for Learning Kernels Based on Centered Alignment
Corinna Cortes
M. Mohri
Afshin Rostamizadeh
65
544
0
02 Mar 2012