ResearchTrend.AI

Grokking as the Transition from Lazy to Rich Training Dynamics
arXiv:2310.06110

9 October 2023
Tanishq Kumar
Blake Bordelon
Samuel Gershman
Cengiz Pehlevan

Papers citing "Grokking as the Transition from Lazy to Rich Training Dynamics"

16 / 16 papers shown
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
28 Feb 2025 (AI4CE)

Grokking Explained: A Statistical Phenomenon
B. W. Carvalho, Artur Garcez, Luís C. Lamb, Emílio Vital Brazil
03 Feb 2025

Grokking at the Edge of Numerical Stability
Lucas Prieto, Melih Barsbey, Pedro A.M. Mediano, Tolga Birdal
08 Jan 2025

Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
Ann Huang, Satpreet H. Singh, Flavio Martinelli, Kanaka Rajan
04 Oct 2024

Learning time-scales in two-layers neural networks
Raphael Berthier, Andrea Montanari, Kangjie Zhou
28 Feb 2023

Grokking modular arithmetic
Andrey Gromov
06 Jan 2023

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song
27 Oct 2022

Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu, Eric J. Michaud, Max Tegmark
03 Oct 2022

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
18 Jul 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
03 May 2022 (MLT)

The high-dimensional asymptotics of first order methods with random data
Michael Celentano, Chen Cheng, Andrea Montanari
14 Dec 2021 (AI4CE)

Geometric compression of invariant manifolds in neural nets
J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart
22 Jul 2020 (MLT)

On the training dynamics of deep networks with $L_2$ regularization
Aitor Lewkowycz, Guy Gur-Ari
15 Jun 2020

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan
07 Feb 2020

Limitations of Lazy Training of Two-layers Neural Networks
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
21 Jun 2019 (MLT)

Algorithms for Learning Kernels Based on Centered Alignment
Corinna Cortes, M. Mohri, Afshin Rostamizadeh
02 Mar 2012