Decoding Layer Saliency in Language Transformers
Elizabeth M. Hou, Greg Castañón
arXiv:2308.05219 (9 August 2023)

Papers citing "Decoding Layer Saliency in Language Transformers"

47 papers shown
Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools
Nils Feldhus, Robert Schwarzenberg, Sebastian Möller
31 Aug 2021

On the Expressive Power of Self-Attention Matrices
Valerii Likhosherstov, K. Choromanski, Adrian Weller
07 Jun 2021

The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
Peter Hase, Harry Xie, Joey Tianyi Zhou
01 Jun 2021

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
Hila Chefer, Shir Gur, Lior Wolf
29 Mar 2021

Transformer Interpretability Beyond Attention Visualization
Hila Chefer, Shir Gur, Lior Wolf
17 Dec 2020

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings, Katja Filippova
12 Oct 2020

How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking
Nicola De Cao, Michael Schlichtkrull, Wilker Aziz, Ivan Titov
30 Apr 2020

Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Y. Hao, Li Dong, Furu Wei, Ke Xu
23 Apr 2020

A Primer in BERTology: What we know about how BERT works
Anna Rogers, Olga Kovaleva, Anna Rumshisky
27 Feb 2020

Attention Interpretability Across NLP Tasks
Shikhar Vashishth, Shyam Upadhyay, Gaurav Singh Tomar, Manaal Faruqui
24 Sep 2019

AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
Eric Wallace, Jens Tuyls, Junlin Wang, Sanjay Subramanian, Matt Gardner, Sameer Singh
19 Sep 2019

Learning to Deceive with Attention-Based Explanations
Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary Chase Lipton
17 Sep 2019

Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel
Yao-Hung Hubert Tsai, Shaojie Bai, M. Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov
30 Aug 2019

Revealing the Dark Secrets of BERT
Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
21 Aug 2019

Attention is not not Explanation
Sarah Wiegreffe, Yuval Pinter
13 Aug 2019

Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
Daniel Omeiza, Skyler Speakman, C. Cintas, Komminist Weldemariam
03 Aug 2019

A Multiscale Visualization of Attention in the Transformer Model
Jesse Vig
12 Jun 2019

Is Attention Interpretable?
Sofia Serrano, Noah A. Smith
09 Jun 2019

Open Sesame: Getting Inside BERT's Linguistic Knowledge
Yongjie Lin, Y. Tan, Robert Frank
04 Jun 2019

Are Sixteen Heads Really Better than One?
Paul Michel, Omer Levy, Graham Neubig
25 May 2019

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov
23 May 2019

BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
15 May 2019

Generating Token-Level Explanations for Natural Language Inference
James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal
24 Apr 2019

Attention is not Explanation
Sarthak Jain, Byron C. Wallace
26 Feb 2019

Understanding Individual Decisions of CNNs via Contrastive Backpropagation
Jindong Gu, Yinchong Yang, Volker Tresp
05 Dec 2018

Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models
Hendrik Strobelt, Sebastian Gehrmann, M. Behrisch, Adam Perer, Hanspeter Pfister, Alexander M. Rush
25 Apr 2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018

Explaining Recurrent Neural Network Predictions in Sentiment Analysis
L. Arras, G. Montavon, K. Müller, Wojciech Samek
22 Jun 2017

SmoothGrad: removing noise by adding noise
D. Smilkov, Nikhil Thorat, Been Kim, F. Viégas, Martin Wattenberg
12 Jun 2017

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
12 Jun 2017

A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg, Su-In Lee
22 May 2017

Interpretable Explanations of Black Boxes by Meaningful Perturbation
Ruth C. Fong, Andrea Vedaldi
11 Apr 2017

Learning Important Features Through Propagating Activation Differences
Avanti Shrikumar, Peyton Greenside, A. Kundaje
10 Apr 2017

Axiomatic Attribution for Deep Networks
Mukund Sundararajan, Ankur Taly, Qiqi Yan
04 Mar 2017

Understanding Neural Networks through Representation Erasure
Jiwei Li, Will Monroe, Dan Jurafsky
24 Dec 2016

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
07 Oct 2016

Top-down Neural Attention by Excitation Backprop
Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff
01 Aug 2016

Representation of linguistic form and function in recurrent neural networks
Ákos Kádár, Grzegorz Chrupała, Afra Alishahi
29 Feb 2016

"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
16 Feb 2016

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
André F. T. Martins, Ramón Fernández Astudillo
05 Feb 2016

Learning Deep Features for Discriminative Localization
Bolei Zhou, A. Khosla, Àgata Lapedriza, A. Oliva, Antonio Torralba
14 Dec 2015

Character-level Convolutional Networks for Text Classification
Xiang Zhang, Jiaqi Zhao, Yann LeCun
04 Sep 2015

Learning Deconvolution Network for Semantic Segmentation
Hyeonwoo Noh, Seunghoon Hong, Bohyung Han
17 May 2015

Striving for Simplicity: The All Convolutional Net
Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller
21 Dec 2014

Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
01 Sep 2014

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
20 Dec 2013

Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler, Rob Fergus
12 Nov 2013