2311.07568
Feature emergence via margin maximization: case studies in algebraic tasks
13 November 2023
Depen Morwani
Benjamin L. Edelman
Costin-Andrei Oncescu
Rosie Zhao
Sham Kakade
Papers citing
"Feature emergence via margin maximization: case studies in algebraic tasks"
14 / 14 papers shown
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu
Zhiyu Ni
Yixin Wang
Wei Hu
CLL
37
0
0
17 Apr 2025
Low Rank and Sparse Fourier Structure in Recurrent Networks Trained on Modular Addition
Akshay Rangamani
40
0
0
28 Mar 2025
Learning richness modulates equality reasoning in neural networks
William L. Tong
C. Pehlevan
47
0
0
12 Mar 2025
Do Mice Grok? Glimpses of Hidden Progress During Overtraining in Sensory Cortex
Tanishq Kumar
Blake Bordelon
C. Pehlevan
Venkatesh N. Murthy
Samuel Gershman
OOD
CLL
SSL
50
0
0
05 Nov 2024
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
Yuandong Tian
54
0
0
02 Oct 2024
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
Neil Rohit Mallinar
Daniel Beaglehole
Libin Zhu
Adityanarayanan Radhakrishnan
Parthe Pandit
Misha Belkin
51
7
0
29 Jul 2024
Why Do You Grok? A Theoretical Analysis of Grokking Modular Addition
Mohamad Amin Mohamadi
Zhiyuan Li
Lei Wu
Danica J. Sutherland
48
9
0
17 Jul 2024
Pre-trained Large Language Models Use Fourier Features to Compute Addition
Tianyi Zhou
Deqing Fu
Vatsal Sharan
Robin Jia
LRM
34
9
0
05 Jun 2024
Grokking Group Multiplication with Cosets
Dashiell Stander
Qinan Yu
Honglu Fan
Stella Biderman
38
9
0
11 Dec 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
41
32
0
30 Nov 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
30
22
0
02 Mar 2023
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
56
76
0
03 Oct 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
125
318
0
21 Sep 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
50
28
0
06 Oct 2021