Bilinear MLPs enable weight-based mechanistic interpretability

10 October 2024

Papers citing "Bilinear MLPs enable weight-based mechanistic interpretability"

Parameterized Synthetic Text Generation with SimpleStories
Authors: Lennart Finke, Chandan Sreedhara, Thomas Dooms, Mat Allen, Emerald Zhang, Juan Diego Rodriguez, Noa Nabeshima, Thomas Marshall, Dan Braun
Tags: SyDa
32 0 0
12 Apr 2025

Compositionality Unlocks Deep Interpretable Models
Authors: Thomas Dooms, Ward Gauderis, Geraint A. Wiggins, José Oramas
Tags: FAtt, CoGe, AI4CE
67 0 0
03 Apr 2025

Mixture of Experts Made Intrinsically Interpretable
Authors: Xingyi Yang, Constantin Venhoff, Ashkan Khakzar, Christian Schroeder de Witt, P. Dokania, Adel Bibi, Philip Torr
Tags: MoE
57 0 0
05 Mar 2025

Bilinear Convolution Decomposition for Causal RL Interpretability
Authors: Narmeen Oozeer, Sinem Erisken, Alice Rigg
65 0 0
01 Dec 2024