ResearchTrend.AI
Residual Stream Analysis with Multi-Layer SAEs

6 September 2024
Tim Lawson
Lucy Farnik
Conor Houghton
Laurence Aitchison

Papers citing "Residual Stream Analysis with Multi-Layer SAEs"

6 / 6 papers shown
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik
Tim Lawson
Conor Houghton
Laurence Aitchison
61
0
0
25 Feb 2025
Transformer Dynamics: A neuroscientific approach to interpretability of large language models
Jesseba Fernando
Grigori Guitchounts
AI4CE
41
0
0
17 Feb 2025
Steering Language Model Refusal with Sparse Autoencoders
Kyle O'Brien
David Majercak
Xavier Fernandes
Richard Edgar
Jingya Chen
Harsha Nori
Dean Carignan
Eric Horvitz
Forough Poursabzi-Sangdeh
LLMSV
67
10
0
18 Nov 2024
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
212
497
0
01 Nov 2022
Disentanglement with Biological Constraints: A Theory of Functional Cell Types
James C. R. Whittington
W. Dorrell
Surya Ganguli
Timothy Edward John Behrens
47
48
0
30 Sep 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
131
322
0
21 Sep 2022