ResearchTrend.AI

Understanding the Role of Individual Units in a Deep Neural Network (arXiv:2009.05041)
10 September 2020
David Bau
Jun-Yan Zhu
Hendrik Strobelt
Àgata Lapedriza
Bolei Zhou
Antonio Torralba
    GAN
Papers citing "Understanding the Role of Individual Units in a Deep Neural Network"

50 / 81 papers shown
What's Pulling the Strings? Evaluating Integrity and Attribution in AI Training and Inference through Concept Shift
Jiamin Chang
Yiming Li
Hammond Pearce
Ruoxi Sun
Bo-wen Li
Minhui Xue
43
0
0
28 Apr 2025
Following the Whispers of Values: Unraveling Neural Mechanisms Behind Value-Oriented Behaviors in LLMs
Ling Hu
Yuemei Xu
Xiaoyang Gu
Letao Han
33
0
0
07 Apr 2025
Effective Skill Unlearning through Intervention and Abstention
Yongce Li
Chung-En Sun
Tsui-Wei Weng
MU
223
0
0
27 Mar 2025
Representational Similarity via Interpretable Visual Concepts
Neehar Kondapaneni
Oisin Mac Aodha
Pietro Perona
DRL
240
0
0
19 Mar 2025
Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation
Jonathan Jacobi
Gal Niv
LRM
ReLM
65
0
0
03 Mar 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
69
0
0
17 Feb 2025
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Shichang Zhang
Tessa Han
Usha Bhalla
Hima Lakkaraju
FAtt
157
0
0
17 Feb 2025
Dimensions underlying the representational alignment of deep neural networks with humans
F. Mahner
Lukas Muttenthaler
Umut Güçlü
M. Hebart
48
4
0
28 Jan 2025
Faithful Counterfactual Visual Explanations (FCVE)
Bismillah Khan
Syed Ali Tariq
Tehseen Zia
Muhammad Ahsan
David Windridge
44
0
0
12 Jan 2025
Towards Counterfactual and Contrastive Explainability and Transparency of DCNN Image Classifiers
Syed Ali Tariq
Tehseen Zia
Mubeen Ghafoor
AAML
62
7
0
12 Jan 2025
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
36
17
0
03 Oct 2024
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
Tung-Yu Wu
Yu-Xiang Lin
Tsui-Wei Weng
54
1
0
24 Jun 2024
Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models
Christopher Burger
Yifan Hu
Thai Le
KELM
49
0
0
22 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
62
16
0
06 Jun 2024
Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang
Juan Cao
Chang Xu
40
13
0
26 May 2024
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
Tianlong Wang
Xianfeng Jiao
Yifan He
Zhongzhi Chen
Yinghao Zhu
Xu Chu
Junyi Gao
Yasha Wang
Liantao Ma
LLMSV
71
8
0
26 May 2024
Error-margin Analysis for Hidden Neuron Activation Labels
Abhilekha Dalal
R. Rayan
Pascal Hitzler
FAtt
31
1
0
14 May 2024
Linear Explanations for Individual Neurons
Tuomas P. Oikarinen
Tsui-Wei Weng
FAtt
MILM
31
6
0
10 May 2024
A Multimodal Automated Interpretability Agent
Tamar Rott Shaham
Sarah Schwettmann
Franklin Wang
Achyuta Rajaram
Evan Hernandez
Jacob Andreas
Antonio Torralba
39
18
0
22 Apr 2024
On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis
Abhilekha Dalal
R. Rayan
Adrita Barua
Eugene Y. Vasserman
Md Kamruzzaman Sarker
Pascal Hitzler
30
4
0
21 Apr 2024
Faster Diffusion via Temporal Attention Decomposition
Haozhe Liu
Wentian Zhang
Jinheng Xie
Francesco Faccio
Mengmeng Xu
Tao Xiang
Mike Zheng Shou
Juan-Manuel Perez-Rua
Jürgen Schmidhuber
DiffM
77
19
0
03 Apr 2024
Language Models Represent Beliefs of Self and Others
Wentao Zhu
Zhining Zhang
Yizhou Wang
MILM
LRM
52
8
0
28 Feb 2024
Understanding the Role of Pathways in a Deep Neural Network
Lei Lyu
Chen Pang
Jihua Wang
35
3
0
28 Feb 2024
Deeper Understanding of Black-box Predictions via Generalized Influence Functions
Hyeonsu Lyu
Jonggyu Jang
Sehyun Ryu
H. Yang
TDI
AI4CE
27
5
0
09 Dec 2023
Conceptualizing the Relationship between AI Explanations and User Agency
Iyadunni Adenuga
Jonathan Dodge
29
2
0
05 Dec 2023
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin
Mohammad Taufeeque
Noah D. Goodman
35
27
0
26 Oct 2023
Unlearning with Fisher Masking
Yufang Liu
Changzhi Sun
Yuanbin Wu
Aimin Zhou
MU
23
5
0
09 Oct 2023
Explaining black box text modules in natural language with language models
Chandan Singh
Aliyah R. Hsu
Richard Antonello
Shailee Jain
Alexander G. Huth
Bin-Xia Yu
Jianfeng Gao
MILM
36
47
0
17 May 2023
LINe: Out-of-Distribution Detection by Leveraging Important Neurons
Yong Hyun Ahn
Gyeong-Moon Park
Seong Tae Kim
OODD
119
31
0
24 Mar 2023
P+: Extended Textual Conditioning in Text-to-Image Generation
A. Voynov
Qinghao Chu
Daniel Cohen-Or
Kfir Aberman
VLM
DiffM
51
176
0
16 Mar 2023
Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment
Alejandro Peña
Ignacio Serna
Aythami Morales
Julian Fierrez
Alfonso Ortega
Ainhoa Herrarte
Manuel Alcántara
J. Ortega-Garcia
FaML
25
35
0
13 Feb 2023
PAMI: partition input and aggregate outputs for model interpretation
Wei Shi
Wentao Zhang
Weishi Zheng
Ruixuan Wang
FAtt
26
3
0
07 Feb 2023
Interpreting Robustness Proofs of Deep Neural Networks
Debangshu Banerjee
Avaljot Singh
Gagandeep Singh
AAML
29
5
0
31 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
44
2
0
26 Jan 2023
Towards NeuroAI: Introducing Neuronal Diversity into Artificial Neural Networks
Fenglei Fan
Yingxin Li
Hanchuan Peng
T. Zeng
Fei Wang
25
5
0
23 Jan 2023
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models
Peter Hase
Joey Tianyi Zhou
Been Kim
Asma Ghandeharioun
MILM
48
167
0
10 Jan 2023
Correspondence Distillation from NeRF-based GAN
Yushi Lan
Chen Change Loy
Bo Dai
38
9
0
19 Dec 2022
Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators
Haitian Zheng
Zhe-nan Lin
Jingwan Lu
Scott D. Cohen
Eli Shechtman
...
Jianming Zhang
Qing Liu
Yuqian Zhou
Sohrab Amirghodsi
Jiebo Luo
DiffM
28
1
0
13 Dec 2022
On the Complexity of Bayesian Generalization
Yuge Shi
Manjie Xu
J. Hopcroft
Kun He
J. Tenenbaum
Song-Chun Zhu
Ying Nian Wu
Wenjuan Han
Yixin Zhu
30
4
0
20 Nov 2022
Data-Centric Debugging: mitigating model failures via targeted data collection
Sahil Singla
Atoosa Malemir Chegini
Mazda Moayeri
Soheil Feiz
27
4
0
17 Nov 2022
Finding Skill Neurons in Pre-trained Transformer-based Language Models
Xiaozhi Wang
Kaiyue Wen
Zhengyan Zhang
Lei Hou
Zhiyuan Liu
Juanzi Li
MILM
MoE
29
51
0
14 Nov 2022
Emergence of Concepts in DNNs?
Tim Räz
21
0
0
11 Nov 2022
An Interactive Interpretability System for Breast Cancer Screening with Deep Learning
Yuzhe Lu
Adam Perer
26
3
0
30 Sep 2022
NeuCEPT: Locally Discover Neural Networks' Mechanism via Critical Neurons Identification with Precision Guarantee
Minh Nhat Vu
Truc D. T. Nguyen
My T. Thai
AAML
27
3
0
18 Sep 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Raukur
A. Ho
Stephen Casper
Dylan Hadfield-Menell
AAML
AI4CE
28
124
0
27 Jul 2022
Debiasing Deep Chest X-Ray Classifiers using Intra- and Post-processing Methods
Ricards Marcinkevics
Ece Ozkan
Julia E. Vogt
30
18
0
26 Jul 2022
Activation Template Matching Loss for Explainable Face Recognition
Huawei Lin
Haozhe Liu
Qiufu Li
Linlin Shen
CVBM
29
1
0
05 Jul 2022
Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values
Zijie J. Wang
Alex Kale
Harsha Nori
P. Stella
M. Nunnally
Duen Horng Chau
Mihaela Vorvoreanu
J. W. Vaughan
R. Caruana
KELM
69
27
0
30 Jun 2022
From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation
Reduan Achtibat
Maximilian Dreyer
Ilona Eisenbraun
S. Bosse
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
FAtt
36
134
0
07 Jun 2022
DL4SciVis: A State-of-the-Art Survey on Deep Learning for Scientific Visualization
Chaoli Wang
J. Han
41
36
0
13 Apr 2022