arXiv: 2309.01029 (v3)
Explainability for Large Language Models: A Survey
2 September 2023
Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jundong Li
LRM
Papers citing "Explainability for Large Language Models: A Survey" (19 of 119 shown)
BERT Rediscovers the Classical NLP Pipeline — Ian Tenney, Dipanjan Das, Ellie Pavlick (15 May 2019). MILM, SSeg. 1,482 citations.

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition — Mengnan Du, Ninghao Liu, Fan Yang, Shuiwang Ji, Xia Hu (27 Mar 2019). FAtt. 50 citations.

Attention is not Explanation — Sarthak Jain, Byron C. Wallace (26 Feb 2019). FAtt. 1,330 citations.

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models — Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Yonatan Belinkov, A. Bau, James R. Glass (21 Dec 2018). MILM. 192 citations.

An Introductory Survey on Attention Mechanisms in NLP Problems — Dichao Hu (12 Nov 2018). AIMat. 247 citations.

Targeted Syntactic Evaluation of Language Models — Rebecca Marvin, Tal Linzen (27 Aug 2018). 417 citations.

Dissecting Contextual Word Embeddings: Architecture and Representation — Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih (27 Aug 2018). 433 citations.

Techniques for Interpretable Machine Learning — Mengnan Du, Ninghao Liu, Xia Hu (31 Jul 2018). FaML. 1,092 citations.

Deep RNNs Encode Soft Hierarchical Syntax — Terra Blevins, Omer Levy, Luke Zettlemoyer (11 May 2018). 111 citations.

Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models — Hendrik Strobelt, Sebastian Gehrmann, M. Behrisch, Adam Perer, Hanspeter Pfister, Alexander M. Rush (25 Apr 2018). VLM, HAI. 240 citations.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding — Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018). ELM. 7,201 citations.

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks — Yonatan Belinkov, Lluís Màrquez i Villodre, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James R. Glass (23 Jan 2018). 165 citations.

Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) — Been Kim, Martin Wattenberg, Justin Gilmer, Carrie J. Cai, James Wexler, F. Viégas, Rory Sayres (30 Nov 2017). FAtt. 1,848 citations.

The (Un)reliability of saliency methods — Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T. Schütt, Sven Dähne, D. Erhan, Been Kim (02 Nov 2017). FAtt, XAI. 689 citations.

A Unified Approach to Interpreting Model Predictions — Scott M. Lundberg, Su-In Lee (22 May 2017). FAtt. 22,135 citations.

Understanding Black-box Predictions via Influence Functions — Pang Wei Koh, Percy Liang (14 Mar 2017). TDI. 2,910 citations.

Axiomatic Attribution for Deep Networks — Mukund Sundararajan, Ankur Taly, Qiqi Yan (04 Mar 2017). OOD, FAtt. 6,027 citations.

Understanding Neural Networks through Representation Erasure — Jiwei Li, Will Monroe, Dan Jurafsky (24 Dec 2016). AAML, MILM. 567 citations.

"Why Should I Trust You?": Explaining the Predictions of Any Classifier — Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin (16 Feb 2016). FAtt, FaML. 17,092 citations.