arXiv:2402.05110
Opening the AI black box: program synthesis via mechanistic interpretability
7 February 2024
Eric J. Michaud
Isaac Liao
Vedang Lad
Ziming Liu
Anish Mudide
Chloe Loughridge
Zifan Carl Guo
Tara Rezaei Kheirkhah
Mateja Vukelić
Max Tegmark
Papers citing
"Opening the AI black box: program synthesis via mechanistic interpretability"
8 / 8 papers shown
Title
Attribution Patching Outperforms Automated Circuit Discovery
Aaquib Syed
Can Rager
Arthur Conmy
125
65
0
16 Oct 2023
Provably safe systems: the only path to controllable AGI
Max Tegmark
Steve Omohundro
61
23
0
05 Sep 2023
Learning the greatest common divisor: explaining transformer predictions
François Charton
43
18
0
29 Aug 2023
The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks
Ziqian Zhong
Ziming Liu
Max Tegmark
Jacob Andreas
65
100
0
30 Jun 2023
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
122
370
0
07 Dec 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
314
514
0
24 Sep 2022
Acquisition of Chess Knowledge in AlphaZero
Thomas McGrath
A. Kapishnikov
Nenad Tomašev
Adam Pearce
Demis Hassabis
Been Kim
Ulrich Paquet
Vladimir Kramnik
55
164
0
17 Nov 2021
AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity
S. Udrescu
A. Tan
Jiahai Feng
Orisvaldo Neto
Tailin Wu
Max Tegmark
65
191
0
18 Jun 2020