ResearchTrend.AI
Feature emergence via margin maximization: case studies in algebraic tasks

13 November 2023
Depen Morwani
Benjamin L. Edelman
Costin-Andrei Oncescu
Rosie Zhao
Sham Kakade

Papers citing "Feature emergence via margin maximization: case studies in algebraic tasks"

14 / 14 papers shown
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu
Zhiyu Ni
Yixin Wang
Wei Hu
CLL
37
0
0
17 Apr 2025
Low Rank and Sparse Fourier Structure in Recurrent Networks Trained on Modular Addition
Akshay Rangamani
40
0
0
28 Mar 2025
Learning richness modulates equality reasoning in neural networks
William L. Tong
C. Pehlevan
47
0
0
12 Mar 2025
Do Mice Grok? Glimpses of Hidden Progress During Overtraining in Sensory Cortex
Tanishq Kumar
Blake Bordelon
C. Pehlevan
Venkatesh N. Murthy
Samuel Gershman
OOD
CLL
SSL
50
0
0
05 Nov 2024
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
Yuandong Tian
54
0
0
02 Oct 2024
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
Neil Rohit Mallinar
Daniel Beaglehole
Libin Zhu
Adityanarayanan Radhakrishnan
Parthe Pandit
Misha Belkin
51
7
0
29 Jul 2024
Why Do You Grok? A Theoretical Analysis of Grokking Modular Addition
Mohamad Amin Mohamadi
Zhiyuan Li
Lei Wu
Danica J. Sutherland
48
9
0
17 Jul 2024
Pre-trained Large Language Models Use Fourier Features to Compute Addition
Tianyi Zhou
Deqing Fu
Vatsal Sharan
Robin Jia
LRM
34
9
0
05 Jun 2024
Grokking Group Multiplication with Cosets
Dashiell Stander
Qinan Yu
Honglu Fan
Stella Biderman
38
9
0
11 Dec 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
41
32
0
30 Nov 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
30
22
0
02 Mar 2023
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
56
76
0
03 Oct 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
125
318
0
21 Sep 2022
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
50
28
0
06 Oct 2021