2210.15435
Grokking phase transitions in learning local rules with gradient descent
26 October 2022
Bojan Žunkovič
E. Ilievski
Papers citing
"Grokking phase transitions in learning local rules with gradient descent"
19 / 19 papers shown
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu
Zhiyu Ni
Yixin Wang
Wei Hu
CLL
32
0
0
17 Apr 2025
Grokking Explained: A Statistical Phenomenon
B. W. Carvalho
Artur Garcez
Luís C. Lamb
Emílio Vital Brazil
64
0
0
03 Feb 2025
Grokking at the Edge of Numerical Stability
Lucas Prieto
Melih Barsbey
Pedro A.M. Mediano
Tolga Birdal
34
3
0
08 Jan 2025
Understanding the Generalization Benefits of Late Learning Rate Decay
Yinuo Ren
Chao Ma
Lexing Ying
AI4CE
24
6
0
21 Jan 2024
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
36
32
0
30 Nov 2023
Understanding Grokking Through A Robustness Viewpoint
Zhiquan Tan
Weiran Huang
AAML
OOD
30
6
0
11 Nov 2023
Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity
Jack Miller
Charles O'Neill
Thang Bui
24
9
0
26 Oct 2023
To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets
Darshil Doshi
Aritra Das
Tianyu He
Andrey Gromov
OOD
32
6
0
19 Oct 2023
Grokking as a First Order Phase Transition in Two Layer Networks
Noa Rubin
Inbar Seroussi
Z. Ringel
31
15
0
05 Oct 2023
Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data
Zhiwei Xu
Yutong Wang
Spencer Frei
Gal Vardi
Wei Hu
MLT
26
23
0
04 Oct 2023
The semantic landscape paradigm for neural networks
Shreyas Gokhale
21
2
0
18 Jul 2023
Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Pascal Junior Tikeng Notsawo
Hattie Zhou
Mohammad Pezeshki
Irina Rish
G. Dumas
17
23
0
23 Jun 2023
Grokking modular arithmetic
Andrey Gromov
35
37
0
06 Jan 2023
Positive unlabeled learning with tensor networks
Bojan Žunkovič
SSL
33
4
0
25 Nov 2022
Deep tensor networks with matrix product operators
Bojan Žunkovič
62
4
0
16 Sep 2022
From Tensor Network Quantum States to Tensorial Recurrent Neural Networks
Dian Wu
R. Rossi
F. Vicentini
Giuseppe Carleo
96
25
0
24 Jun 2022
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
53
25
0
06 Dec 2021
Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang
Hangfeng He
Qi Long
Weijie J. Su
FAtt
122
165
0
29 Jan 2021
Modeling Sequences with Quantum States: A Look Under the Hood
T. Bradley
Miles E. Stoudenmire
John Terilla
68
48
0
16 Oct 2019