ResearchTrend.AI
Decomposing and Editing Predictions by Modeling Model Computation


17 April 2024
Harshay Shah
Andrew Ilyas
A. Madry
KELM

Papers citing "Decomposing and Editing Predictions by Modeling Model Computation"

10 / 10 papers shown
Learning to Attribute with Attention
Benjamin Cohen-Wang
Yung-Sung Chuang
Aleksander Madry
30
0
0
18 Apr 2025
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Shichang Zhang
Tessa Han
Usha Bhalla
Hima Lakkaraju
FAtt
147
0
0
17 Feb 2025
Jet Expansions of Residual Computation
Yihong Chen
Xiangxiang Xu
Yao Lu
Pontus Stenetorp
Luca Franceschi
34
3
0
08 Oct 2024
CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept
YuXuan Wu
Bonaventure F. P. Dossou
Dianbo Liu
MU
21
0
0
08 Oct 2024
Optimal ablation for interpretability
Maximilian Li
Lucas Janson
FAtt
49
2
0
16 Sep 2024
ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang
Harshay Shah
Kristian Georgiev
Aleksander Madry
LRM
30
18
0
01 Sep 2024
When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models
Ting-Yun Chang
Jesse Thomason
Robin Jia
45
4
0
19 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
59
16
0
06 Jun 2024
Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP
S. Balasubramanian
Samyadeep Basu
S. Feizi
CLIP
31
3
0
03 Jun 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
34
6
0
09 May 2024