Post-hoc Interpretability for Neural NLP: A Survey
Andreas Madsen, Siva Reddy, A. Chandar. arXiv:2108.04840, 10 August 2021. [XAI]
Papers citing "Post-hoc Interpretability for Neural NLP: A Survey" (47 papers):
Display Content, Display Methods and Evaluation Methods of the HCI in Explainable Recommender Systems: A Survey
Weiqing Li, Yue Xu, Yuefeng Li, Yinghui Huang. 14 May 2025.

Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification
Leon Eshuijs, Shihan Wang, Antske Fokkens. 09 May 2025.

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Mahdi Dhaini, Ege Erdogan, Nils Feldhus, Gjergji Kasneci. 02 May 2025.

Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation
Jonathan Jacobi, Gal Niv. 03 Mar 2025. [LRM, ReLM]

Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization
Or Raphael Bidusa, Shaul Markovitch. 20 Feb 2025.

FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation
Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt. 01 Jan 2025. [AAML]

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci, Marco Gaido, Beatrice Savoldi, Matteo Negri, Mauro Cettolo, L. Bentivogli. 03 Nov 2024.

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart. 27 Jul 2024.

Latent Concept-based Explanation of NLP Models
Xuemin Yu, Fahim Dalvi, Nadir Durrani, Marzia Nouri, Hassan Sajjad. 18 Apr 2024. [LRM, FAtt]

Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending
Mario Sanz-Guerrero, Javier Arroyo. 29 Jan 2024.

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva. 11 Jan 2024.

HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments
Neeraj Kumar Singh, Koyel Ghosh, Joy Mahapatra, Utpal Garain, Apurbalal Senapati. 20 Dec 2023.

Interpreting Pretrained Language Models via Concept Bottlenecks
Zhen Tan, Lu Cheng, Song Wang, Yuan Bo, Wenlin Yao, Huan Liu. 08 Nov 2023. [LRM]

Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman. 26 Oct 2023.

InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations
Nils Feldhus, Qianli Wang, Tatiana Anikina, Sahil Chopra, Cennet Oguz, Sebastian Möller. 09 Oct 2023.

AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao, J. Vaughan. 02 Jun 2023.

Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition
Xiao-lan Wu, P. Bell, A. Rajan. 29 May 2023.

Computational modeling of semantic change
Nina Tahmasebi, Haim Dubossarsky. 13 Apr 2023.

Multi-resolution Interpretation and Diagnostics Tool for Natural Language Classifiers
P. Jalali, Nengfeng Zhou, Yufei Yu. 06 Mar 2023. [AAML]

IFAN: An Explainability-Focused Interaction Framework for Humans and NLP Models
Edoardo Mosca, Daryna Dementieva, Tohid Ebrahim Ajdari, Maximilian Kummeth, Kirill Gringauz, Yutong Zhou, Georg Groh. 06 Mar 2023.

Explanations for Automatic Speech Recognition
Xiao-lan Wu, P. Bell, A. Rajan. 27 Feb 2023.

A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries
P. Kudva, R. Bordawekar, Apoorva Nitsure. 23 Feb 2023.

Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations
Valerie Chen, Q. V. Liao, Jennifer Wortman Vaughan, Gagan Bansal. 18 Jan 2023.

Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation
O. Serikov, Vitaly Protasov, E. Voloshina, V. Knyazkova, Tatiana Shavrina. 24 Oct 2022.

Explainable Causal Analysis of Mental Health on Social Media Data
Chandni Saxena, Muskan Garg, G. Saxena. 16 Oct 2022. [CML]

Review of Natural Language Processing in Pharmacology
D. Trajanov, Vangel Trajkovski, Makedonka Dimitrieva, Jovana Dobreva, Milos Jovanovik, Matej Klemen, Aleš Žagar, Marko Robnik-Šikonja. 22 Aug 2022. [LM&MA]

ferret: a Framework for Benchmarking Explainers on Transformers
Giuseppe Attanasio, Eliana Pastor, C. Bonaventura, Debora Nozza. 02 Aug 2022.

Is Attention Interpretation? A Quantitative Assessment On Sets
Jonathan Haab, N. Deutschmann, María Rodríguez Martínez. 26 Jul 2022.

Mediators: Conversational Agents Explaining NLP Model Behavior
Nils Feldhus, A. Ravichandran, Sebastian Möller. 13 Jun 2022.

Interactive Model Cards: A Human-Centered Approach to Model Documentation
Anamaria Crisan, Margaret Drouhard, Jesse Vig, Nazneen Rajani. 05 May 2022. [HAI]

Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary, N. Chatterjee, S. K. Saha. 31 Mar 2022. [FAtt]

Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà. 08 Mar 2022.

Interpreting Language Models with Contrastive Explanations
Kayo Yin, Graham Neubig. 21 Feb 2022. [MILM]

"Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
Jasmijn Bastings, Sebastian Ebert, Polina Zablotskaia, Anders Sandholm, Katja Filippova. 14 Nov 2021.

Explainable AI (XAI): A Systematic Meta-Survey of Current Challenges and Future Opportunities
Waddah Saeed, C. Omlin. 11 Nov 2021. [XAI]

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, Siva Reddy. 15 Oct 2021.

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov. 24 Feb 2021.

UnNatural Language Inference
Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams. 30 Dec 2020.

It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan, Shafiq R. Joty, Min-Yen Kan, R. Socher. 09 May 2020.

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei. 23 Jan 2020.

A Survey on Bias and Fairness in Machine Learning
Ninareh Mehrabi, Fred Morstatter, N. Saxena, Kristina Lerman, Aram Galstyan. 23 Aug 2019. [SyDa, FaML]

e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom. 04 Dec 2018. [LRM]

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni. 03 May 2018.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. 20 Apr 2018. [ELM]

A causal framework for explaining the predictions of black-box sequence-to-sequence models
David Alvarez-Melis, Tommi Jaakkola. 06 Jul 2017. [CML]

Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez, Been Kim. 28 Feb 2017. [XAI, FaML]

Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean. 16 Jan 2013. [3DV]