Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

27 May 2025
James Oldfield, Shawn Im, Yixuan Li, M. Nicolaou, Ioannis Patras, Grigorios G. Chrysos
Tags: MoE

Papers citing "Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders"

13 papers shown

Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos, Yongtao Wu, Razvan Pascanu, Philip Torr, Volkan Cevher
Tags: AAML · Citations: 2 · 17 Apr 2025

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, ..., Matthew Wearden, Arthur Conmy, Samuel Marks, Neel Nanda
Tags: MU · Citations: 19 · 12 Mar 2025

Closed-Form Feedback-Free Learning with Forward Projection
Robert O'Shea, Bipin Rajendran
Citations: 0 · 27 Jan 2025

Decomposing The Dark Matter of Sparse Autoencoders
Joshua Engels, Logan Riggs, Max Tegmark
Tags: LLMSV · Citations: 12 · 18 Oct 2024

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
David Chanin, James Wilken-Smith, Tomáš Dulka, Hardik Bhatnagar, Joseph Isaac Bloom
Citations: 31 · 22 Sep 2024

Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
James Oldfield, Markos Georgopoulos, Grigorios G. Chrysos, Christos Tzelepis, Yannis Panagakis, M. Nicolaou, Jiankang Deng, Ioannis Patras
Tags: MoE · Citations: 9 · 19 Feb 2024

Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
Citations: 350 · 07 Dec 2022

Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa, C. Riquelme, J. Puigcerver, Rodolphe Jenatton, N. Houlsby
Tags: VLM, MoE · Citations: 190 · 06 Jun 2022

PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs
James Oldfield, Christos Tzelepis, Yannis Panagakis, M. Nicolaou, Ioannis Patras
Tags: GAN · Citations: 24 · 31 May 2022

ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, J. Dean, Noam M. Shazeer, W. Fedus
Tags: MoE · Citations: 191 · 17 Feb 2022

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, ..., Kun Zhang, Quoc V. Le, Yonghui Wu, Zhiwen Chen, Claire Cui
Tags: ALM, MoE · Citations: 794 · 13 Dec 2021

Dynamic Neural Networks: A Survey
Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, Yulin Wang
Tags: 3DH, AI4TS, AI4CE · Citations: 638 · 09 Feb 2021

The Mythos of Model Interpretability
Zachary Chase Lipton
Tags: FaML · Citations: 3,672 · 10 Jun 2016