arXiv:2103.03386
Clusterability in Neural Networks
4 March 2021
Daniel Filan
Stephen Casper
Shlomi Hod
Cody Wild
Andrew Critch
Stuart J. Russell
GNN
Papers citing "Clusterability in Neural Networks"
8 / 8 papers shown
Evaluating Explanations: An Explanatory Virtues Framework for Mechanistic Interpretability -- The Strange Science Part I.ii
Kola Ayonrinde
Louis Jaburi
XAI
80
1
0
02 May 2025
Modular Training of Neural Networks aids Interpretability
Satvik Golechha
Maheep Chaudhary
Joan Velja
Alessandro Abate
Nandi Schoots
79
0
0
04 Feb 2025
Training Neural Networks for Modularity aids Interpretability
Satvik Golechha
Dylan R. Cope
Nandi Schoots
27
0
0
24 Sep 2024
Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu
Eric Gan
Max Tegmark
26
36
0
04 May 2023
Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior
Jean-Stanislas Denain
Jacob Steinhardt
AAML
39
7
0
27 Jun 2022
Leveraging the Graph Structure of Neural Network Training Dynamics
Fatemeh Vahedian
Ruiyu Li
Puja Trivedi
Di Jin
Danai Koutra
AI4CE
GNN
21
3
0
09 Nov 2021
Quantifying Local Specialization in Deep Neural Networks
Shlomi Hod
Daniel Filan
Stephen Casper
Andrew Critch
Stuart J. Russell
62
10
0
13 Oct 2021
Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment
Michael Chang
Sid Kaushik
Sergey Levine
Thomas L. Griffiths
25
8
0
28 Jun 2021