Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.01870
Cited By
DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models
3 October 2023
Albert Garde
Esben Kran
Fazl Barez
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models"
8 / 8 papers shown
Title
Quantifying Feature Space Universality Across Large Language Models via Sparse Autoencoders
Michael Lan
Philip Torr
Austin Meek
Ashkan Khakzar
David M. Krueger
Fazl Barez
43
11
0
09 Oct 2024
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee
Neel Nanda
Matthew Pauly
Katherine Harvey
Dmitrii Troitskii
Dimitris Bertsimas
MILM
162
190
0
02 May 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
212
497
0
01 Nov 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
463
0
24 Sep 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
131
322
0
21 Sep 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,000
0
31 Dec 2020
Methods for Interpreting and Understanding Deep Neural Networks
G. Montavon
Wojciech Samek
K. Müller
FaML
234
2,238
0
24 Jun 2017
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
296
31,267
0
16 Jan 2013
1