What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

21 December 2018 · arXiv:1812.09355
Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, A. Bau, James R. Glass
MILM

Papers citing "What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models"

41 papers

Discovering Influential Neuron Path in Vision Transformers
Yifan Wang, Yifei Liu, Yingdong Shi, Chong Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren
ViT · 12 Mar 2025

Explainable Neural Networks with Guarantees: A Sparse Estimation Approach
Antoine Ledent, Peng Liu
FAtt · 20 Feb 2025

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ, Zack Gottesman, Jonathan Kropko, Thomas Hartvigsen
LRM · 20 Feb 2025

How Do Artificial Intelligences Think? The Three Mathematico-Cognitive Factors of Categorical Segmentation Operated by Synthetic Neurons
Michael Pichat, William Pogrund, Armanush Gasparian, Paloma Pichat, Samuel Demarchi, Michael Veillet-Guillem
26 Dec 2024

Linguistically Grounded Analysis of Language Models using Shapley Head Values
Marcell Richard Fekete, Johannes Bjerva
17 Oct 2024

Crafting Large Language Models for Enhanced Interpretability
Chung-En Sun, Tuomas P. Oikarinen, Tsui-Wei Weng
05 Jul 2024

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai, Yilun Zhou, Shi Feng, Abulhair Saparov, Ziyu Yao
02 Jul 2024

Graphical Perception of Saliency-based Model Explanations
Yayan Zhao, Mingwei Li, Matthew Berger
XAI, FAtt · 11 Jun 2024

A Multimodal Automated Interpretability Agent
Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang, Achyuta Rajaram, Evan Hernandez, Jacob Andreas, Antonio Torralba
22 Apr 2024

Explaining black box text modules in natural language with language models
Chandan Singh, Aliyah R. Hsu, Richard Antonello, Shailee Jain, Alexander G. Huth, Bin-Xia Yu, Jianfeng Gao
MILM · 17 May 2023

Redundancy and Concept Analysis for Code-trained Language Models
Arushi Sharma, Zefu Hu, Christopher Quinn, Ali Jannesari
01 May 2023

N2G: A Scalable Approach for Quantifying Interpretable Neuron Representations in Large Language Models
Alex Foote, Neel Nanda, Esben Kran, Ioannis Konstas, Fazl Barez
MILM · 22 Apr 2023

NxPlain: Web-based Tool for Discovery of Latent Concepts
Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Tamim Jaban, Musab Husaini, Ummar Abbas
06 Mar 2023

Finding Skill Neurons in Pre-trained Transformer-based Language Models
Xiaozhi Wang, Kaiyue Wen, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li
MILM, MoE · 14 Nov 2022

ConceptX: A Framework for Latent Concept Analysis
Firoj Alam, Fahim Dalvi, Nadir Durrani, Hassan Sajjad, A. Khan, Jia Xu
12 Nov 2022

Impact of Adversarial Training on Robustness and Generalizability of Language Models
Enes Altinisik, Hassan Sajjad, Husrev Taha Sencar, Safa Messaoud, Sanjay Chawla
AAML · 10 Nov 2022

On the Transformation of Latent Space in Fine-Tuned NLP Models
Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Firoj Alam
23 Oct 2022

Analyzing Transformers in Embedding Space
Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant
06 Sep 2022

Survey: Exploiting Data Redundancy for Optimization of Deep Learning
Jou-An Chen, Wei Niu, Bin Ren, Yanzhi Wang, Xipeng Shen
29 Aug 2022

Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis
Xuanyuan Han, Pietro Barbiero, Dobrik Georgiev, Lucie Charlotte Magister, Pietro Lió
MILM · 22 Aug 2022

Discovering Latent Concepts Learned in BERT
Fahim Dalvi, A. Khan, Firoj Alam, Nadir Durrani, Jia Xu, Hassan Sajjad
SSL · 15 May 2022

Interpreting Arabic Transformer Models
Ahmed Abdelali, Nadir Durrani, Fahim Dalvi, Hassan Sajjad
19 Jan 2022

Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C. Mozer
10 Jan 2022

On the Pitfalls of Analyzing Individual Neurons in Language Models
Omer Antverg, Yonatan Belinkov
MILM · 14 Oct 2021

Neuron-level Interpretation of Deep NLP Models: A Survey
Hassan Sajjad, Nadir Durrani, Fahim Dalvi
MILM, AI4CE · 30 Aug 2021

What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury, Nadir Durrani, Ahmed M. Ali
01 Jul 2021

How transfer learning impacts linguistic knowledge in deep NLP models?
Nadir Durrani, Hassan Sajjad, Fahim Dalvi
31 May 2021

An Interpretability Illusion for BERT
Tolga Bolukbasi, Adam Pearce, Ann Yuan, Andy Coenen, Emily Reif, Fernanda Viégas, Martin Wattenberg
MILM, FAtt · 14 Apr 2021

Local Interpretations for Explainable Natural Language Processing: A Survey
Siwen Luo, Hamish Ivison, S. Han, Josiah Poon
MILM · 20 Mar 2021

Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva, R. Schuster, Jonathan Berant, Omer Levy
KELM · 29 Dec 2020

Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo, Artur Kulmizev, Xiaoxi Mao
09 Nov 2020

How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents
Richard Meyes, Moritz Schneider, Tobias Meisen
07 Apr 2020

Under the Hood of Neural Networks: Characterizing Learned Representations by Functional Neuron Populations and Network Ablations
Richard Meyes, Constantin Waubert de Puiseau, Andres Felipe Posada-Moreno, Tobias Meisen
AI4CE · 02 Apr 2020

Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs
Matthew L. Leavitt, Ari S. Morcos
03 Mar 2020

Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV, AI4TS, MedIm · 04 Dec 2019

On the Linguistic Representational Power of Neural Machine Translation Models
Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James R. Glass
MILM · 01 Nov 2019

Investigating Multilingual NMT Representations at Scale
Sneha Kudugunta, Ankur Bapna, Isaac Caswell, N. Arivazhagan, Orhan Firat
LRM · 05 Sep 2019

Visual Interaction with Deep Learning Models through Collaborative Semantic Inference
Sebastian Gehrmann, Hendrik Strobelt, Robert Krüger, Hanspeter Pfister, Alexander M. Rush
HAI · 24 Jul 2019

On the Realization of Compositionality in Neural Networks
Joris Baan, Jana Leible, Mitja Nikolaus, David Rau, Dennis Ulmer, Tim Baumgärtner, Dieuwke Hupkes, Elia Bruni
CoGe · 04 Jun 2019

Revisiting the Importance of Individual Units in CNNs via Ablation
Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba
FAtt · 07 Jun 2018

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
03 May 2018