2407.20199
Cited By
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
29 July 2024
Neil Rohit Mallinar
Daniel Beaglehole
Libin Zhu
Adityanarayanan Radhakrishnan
Parthe Pandit
Misha Belkin
Papers citing
"Emergence in non-neural models: grokking modular arithmetic via average gradient outer product"
8 / 8 papers shown
Quiet Feature Learning in Algorithmic Tasks
Prudhviraj Naidu
Zixian Wang
Leon Bergen
R. Paturi
VLM
54
0
0
06 May 2025
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
38
0
0
04 Apr 2025
Low Rank and Sparse Fourier Structure in Recurrent Networks Trained on Modular Addition
Akshay Rangamani
40
0
0
28 Mar 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
85
4
0
31 Dec 2024
Linear Recursive Feature Machines provably recover low-rank matrices
Adityanarayanan Radhakrishnan
Misha Belkin
D. Drusvyatskiy
58
8
0
09 Jan 2024
Understanding the Covariance Structure of Convolutional Filters
Asher Trockman
Devin Willmott
J. Zico Kolter
52
11
0
07 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
56
76
0
03 Oct 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
M. Girotti
Ioannis Mitliagkas
Murat A. Erdogdu
MLT
324
48
0
29 Sep 2022