ResearchTrend.AI
SplInterp: Improving our Understanding and Training of Sparse Autoencoders


17 May 2025
Jeremy Budd, Javier Ideami, Benjamin Macdowall Rynne, Keith Duggar, Randall Balestriero

Papers citing "SplInterp: Improving our Understanding and Training of Sparse Autoencoders"

5 / 5 papers shown
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Sai Sumedh R. Hindupur, Ekdeep Singh Lubana, Thomas Fel, Demba Ba
03 Mar 2025
Sparse Autoencoders Do Not Find Canonical Units of Analysis
Patrick Leask, Bart Bussmann, Michael T. Pearce, Joseph Isaac Bloom, Curt Tigges, Noura Al Moubayed, Lee D. Sharkey, Neel Nanda
07 Feb 2025
Sparse Autoencoders Can Interpret Randomly Initialized Transformers
Thomas Heap, Tim Lawson, Lucy Farnik, Laurence Aitchison
29 Jan 2025
Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah
21 Sep 2022
Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods
Guoyin Li, Ting Kei Pong
09 Feb 2016