2311.04131
Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
7 November 2023
Michael Lan
Phillip H. S. Torr
Fazl Barez
LRM
Papers citing "Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models"
7 / 7 papers shown
Title
Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation
Reilly Haskins
Benjamin Adams
16
0
0
16 May 2025
Optimal ablation for interpretability
Maximilian Li
Lucas Janson
FAtt
49
2
0
16 Sep 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
Chris Wendler
V. Veselovsky
Giovanni Monea
Robert West
56
101
0
16 Feb 2024
Understanding Addition in Transformers
Philip Quirke
Fazl Barez
16
15
0
19 Oct 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
351
2,232
0
22 Mar 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
212
507
0
01 Nov 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
474
0
24 Sep 2022