2407.14662
Relational Composition in Neural Networks: A Survey and Call to Action
19 July 2024
Martin Wattenberg
Fernanda Viégas
CoGe
Papers citing
"Relational Composition in Neural Networks: A Survey and Call to Action"
7 / 7 papers shown
Title
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
111
7
0
06 Feb 2025
Towards Unifying Interpretability and Control: Evaluation via Intervention
Usha Bhalla
Suraj Srinivas
Asma Ghandeharioun
Himabindu Lakkaraju
40
5
0
07 Nov 2024
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson
Lucy Farnik
Conor Houghton
Laurence Aitchison
26
3
0
06 Sep 2024
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
58
13
0
13 Jun 2024
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
102
169
0
10 Oct 2023
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
125
318
0
21 Sep 2022
Compositionality as Lexical Symmetry
Ekin Akyürek
Jacob Andreas
CoGe
50
8
0
30 Jan 2022