Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.04341
Cited By
What Does BERT Look At? An Analysis of BERT's Attention
11 June 2019
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"What Does BERT Look At? An Analysis of BERT's Attention"
50 / 886 papers shown
Title
The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders
Han He
Jinho Choi
56
87
0
14 Sep 2021
Attention Weights in Transformer NMT Fail Aligning Words Between Sequences but Largely Explain Model Predictions
Javier Ferrando
Marta R. Costa-jussá
22
13
0
13 Sep 2021
GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks
Weicheng Ma
Renze Lou
Kai Zhang
Lili Wang
Soroush Vosoughi
23
8
0
13 Sep 2021
Artificial Text Detection via Examining the Topology of Attention Maps
Laida Kushnareva
D. Cherniavskii
Vladislav Mikhailov
Ekaterina Artemova
S. Barannikov
A. Bernstein
Irina Piontkovskaya
D. Piontkovski
Evgeny Burnaev
51
49
0
10 Sep 2021
Sparsity and Sentence Structure in Encoder-Decoder Attention of Summarization Systems
Potsawee Manakul
Mark Gales
21
5
0
08 Sep 2021
Eliminating Sentiment Bias for Aspect-Level Sentiment Classification with Unsupervised Opinion Extraction
Bo Wang
Tao Shen
Guodong Long
Dinesh Manocha
Yi-Ju Chang
14
25
0
06 Sep 2021
CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models
Arjun Reddy Akula
Keze Wang
Changsong Liu
Sari Saba-Sadiya
Hongjing Lu
S. Todorovic
J. Chai
Song-Chun Zhu
31
47
0
03 Sep 2021
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting
Xiang Chen
Lei Li
Shumin Deng
Chuanqi Tan
Changliang Xu
Fei Huang
Luo Si
Huajun Chen
Ningyu Zhang
VLM
42
65
0
31 Aug 2021
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
Linyang Li
Demin Song
Xiaonan Li
Jiehang Zeng
Ruotian Ma
Xipeng Qiu
33
135
0
31 Aug 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou
Nikolaos Aletras
32
16
0
31 Aug 2021
T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP
Raymond Li
Wen Xiao
Lanjun Wang
Hyeju Jang
Giuseppe Carenini
ViT
31
23
0
31 Aug 2021
Legal Search in Case Law and Statute Law
Julien Rossi
Evangelos Kanoulas
AILaw
ELM
122
8
0
23 Aug 2021
VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law
Julien Rossi
Svitlana Vakulenko
Evangelos Kanoulas
AILaw
25
2
0
23 Aug 2021
Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks
Weicheng Ma
Kai Zhang
Renze Lou
Lili Wang
Soroush Vosoughi
196
15
0
18 Aug 2021
Post-hoc Interpretability for Neural NLP: A Survey
Andreas Madsen
Siva Reddy
A. Chandar
XAI
27
225
0
10 Aug 2021
Differentiable Subset Pruning of Transformer Heads
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
45
53
0
10 Aug 2021
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification
Yiding Jiang
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
36
25
0
05 Aug 2021
FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
T. Nguyen
Vai Suliafu
Stanley J. Osher
Long Chen
Bao Wang
29
35
0
05 Aug 2021
Structural Guidance for Transformer Language Models
Peng Qian
Tahira Naseem
R. Levy
Ramón Fernández Astudillo
47
31
0
30 Jul 2021
Graph-free Multi-hop Reading Comprehension: A Select-to-Guide Strategy
Bohong Wu
ZhuoSheng Zhang
Hai Zhao
32
20
0
25 Jul 2021
Multi-Stream Transformers
Andrey Kravchenko
Anna Rumshisky
AI4CE
14
0
0
21 Jul 2021
Human Attention during Goal-directed Reading Comprehension Relies on Task Optimization
Jiajie Zou
Yuran Zhang
Jialu Li
Xing Tian
Nai Ding
AIMat
38
2
0
13 Jul 2021
Hate versus Politics: Detection of Hate against Policy makers in Italian tweets
Armend Duzha
Cristiano Casadei
Michael Tosi
Fabio Celli
25
6
0
12 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
49
259
0
01 Jul 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
35
50
0
23 Jun 2021
It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning
Alexey Tikhonov
Max Ryabinin
LRM
18
57
0
22 Jun 2021
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
Srinadh Bhojanapalli
Ayan Chakrabarti
Himanshu Jain
Sanjiv Kumar
Michal Lukasik
Andreas Veit
26
8
0
16 Jun 2021
What Context Features Can Transformer Language Models Use?
J. O'Connor
Jacob Andreas
KELM
29
75
0
15 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
816
0
14 Jun 2021
Thinking Like Transformers
Gail Weiss
Yoav Goldberg
Eran Yahav
AI4CE
35
128
0
13 Jun 2021
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation
Yuanxin Liu
Fandong Meng
Zheng Lin
Weiping Wang
Jie Zhou
21
6
0
10 Jun 2021
Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units
Sara Meftah
N. Semmar
Y. Tamaazousti
H. Essafi
F. Sadat
20
3
0
09 Jun 2021
On Sample Based Explanation Methods for NLP:Efficiency, Faithfulness, and Semantic Evaluation
Wei Zhang
Ziming Huang
Yada Zhu
Guangnan Ye
Xiaodong Cui
Fan Zhang
31
17
0
09 Jun 2021
Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning
Piotr Pikekos
Henryk Michalewski
Mateusz Malinowski
32
28
0
07 Jun 2021
Attend and select: A segment selective transformer for microblog hashtag generation
Qianren Mao
Xi Li
Bang Liu
Shu Guo
Peng Hao
Jianxin Li
Lihong Wang
31
3
0
06 Jun 2021
Causal Abstractions of Neural Networks
Atticus Geiger
Hanson Lu
Thomas Icard
Christopher Potts
NAI
CML
17
222
0
06 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
33
372
0
04 Jun 2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg
G. Henter
MILM
35
21
0
03 Jun 2021
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
Joshua Forster Feinglass
Yezhou Yang
24
21
0
02 Jun 2021
On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
Tianchu Ji
Shraddhan Jain
M. Ferdman
Peter Milder
H. Andrew Schwartz
Niranjan Balasubramanian
MQ
58
15
0
02 Jun 2021
Implicit Representations of Meaning in Neural Language Models
Belinda Z. Li
Maxwell Nye
Jacob Andreas
NAI
MILM
21
172
0
01 Jun 2021
Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
Anmol Nayak
Hariprasad Timmapathini
35
4
0
01 Jun 2021
Do Multilingual Neural Machine Translation Models Contain Language Pair Specific Attention Heads?
Zae Myung Kim
Laurent Besacier
Vassilina Nikoulina
D. Schwab
MILM
55
7
0
31 May 2021
Cascaded Head-colliding Attention
Lin Zheng
Zhiyong Wu
Lingpeng Kong
27
2
0
31 May 2021
On the Interplay Between Fine-tuning and Composition in Transformers
Lang-Chi Yu
Allyson Ettinger
33
14
0
31 May 2021
UCPhrase: Unsupervised Context-aware Quality Phrase Tagging
Xiaotao Gu
Zihan Wang
Zhenyu Bi
Yu Meng
Liyuan Liu
Jiawei Han
Jingbo Shang
103
36
0
28 May 2021
Learning Relation Alignment for Calibrated Cross-modal Retrieval
Shuhuai Ren
Junyang Lin
Guangxiang Zhao
Rui Men
An Yang
Jingren Zhou
Xu Sun
Hongxia Yang
28
37
0
28 May 2021
Inspecting the concept knowledge graph encoded by modern language models
Carlos Aspillaga
Marcelo Mendoza
Alvaro Soto
27
13
0
27 May 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
45
765
0
26 May 2021
Context-Sensitive Visualization of Deep Learning Natural Language Processing Models
A. Dunn
Diana Inkpen
Razvan Andonie
19
8
0
25 May 2021
Previous
1
2
3
...
11
12
13
...
16
17
18
Next