Survival of the Fittest Representation: A Case Study with Modular Addition

27 May 2024

Xiaoman Delores Ding

Eric J. Michaud

Papers citing "Survival of the Fittest Representation: A Case Study with Modular Addition"

Title | Authors | Tags | Counts | Date
Harmonic Loss Trains Interpretable AI Models | David D. Baek, Ziming Liu, Riya Tyagi, Max Tegmark | | 92 2 0 | 03 Feb 2025
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics | Yaniv Nikankin, Anja Reusch, Aaron Mueller, Yonatan Belinkov | AIFin LRM | 33 21 0 | 28 Oct 2024
The Platonic Representation Hypothesis | Minyoung Huh, Brian Cheung, Tongzhou Wang, Phillip Isola | | 72 110 0 | 13 May 2024
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation | Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe | AI4CE | 42 25 0 | 10 Apr 2024
A Resource Model For Neural Scaling Law | Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore | | 91 4 0 | 07 Feb 2024
Increasing Trust in Language Models through the Reuse of Verified Circuits | Philip Quirke, Clement Neo, Fazl Barez | KELM LRM | 36 3 0 | 04 Feb 2024
Universal Neurons in GPT2 Language Models | Wes Gurnee, Theo Horsley, Zifan Carl Guo, Tara Rezaei Kheirkhah, Qinyi Sun, Will Hathaway, Neel Nanda, Dimitris Bertsimas | MILM | 94 37 0 | 22 Jan 2024
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Samuel Marks, Max Tegmark | HILM | 102 168 0 | 10 Oct 2023
Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle | Rylan Schaeffer, Mikail Khona, Zachary Robertson, Akhilan Boopathy, Kateryna Pistunova, J. Rocks, Ila Rani Fiete, Oluwasanmi Koyejo | | 62 31 0 | 24 Mar 2023
Omnigrok: Grokking Beyond Algorithmic Data | Ziming Liu, Eric J. Michaud, Max Tegmark | | 56 76 0 | 03 Oct 2022
In-context Learning and Induction Heads | Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah | | 247 458 0 | 24 Sep 2022
Contrastive Representation Learning: A Framework and Review | Phúc H. Lê Khắc, Graham Healy, A. Smeaton | SSL AI4TS | 164 684 0 | 10 Oct 2020