arXiv: 1906.05714
A Multiscale Visualization of Attention in the Transformer Model

12 June 2019
Jesse Vig
    ViT

Papers citing "A Multiscale Visualization of Attention in the Transformer Model"

50 / 98 papers shown
What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models
Jan Kapar
Niklas Koenen
Martin Jullum
66
0
0
29 Apr 2025
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang
Yifei Liu
Yingdong Shi
Chong Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
71
0
0
12 Mar 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
48
0
0
07 Feb 2025
Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention
Zhenyu Guo
Wenguang Chen
48
0
0
01 Jan 2025
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Fan Zhang
Yongbin Li
59
5
0
17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
42
13
0
27 Jul 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Chengzhengxu Li
Xiaoming Liu
Zhaohan Zhang
Yichen Wang
Chen Liu
Y. Lan
Chao Shen
60
2
0
15 Jun 2024
Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics
Fangru Lin
Daniel Altshuler
J. Pierrehumbert
38
1
0
04 Apr 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
Carlo Nicolini
Jacopo Staiano
Bruno Lepri
Raffaele Marino
MoE
34
1
0
13 Mar 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis
Shashank Kotyan
Tatsuya Ueda
Danilo Vasconcellos Vargas
32
1
0
07 Dec 2023
On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series
Rita Kuznetsova
Alizée Pace
Manuel Burger
Hugo Yèche
Gunnar Rätsch
AI4TS
39
5
0
15 Nov 2023
AMPLIFY:Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer
Leixin Yang
Yu Xiang
31
0
0
22 Sep 2023
Nebula: Self-Attention for Dynamic Malware Analysis
Dmitrijs Trizna
Christian Scano
Battista Biggio
Fabio Roli
24
13
0
19 Sep 2023
Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models
A. A. Falaki
R. Gras
ViT
26
7
0
28 Aug 2023
Towards the Visualization of Aggregated Class Activation Maps to Analyse the Global Contribution of Class Features
Igor Cherepanov
D. Sessler
Alex Ulmer
Hendrik Lücke-Tieke
Jörn Kohlhammer
FAtt
24
0
0
29 Jul 2023
Zero-Shot Text Classification via Self-Supervised Tuning
Chaoqun Liu
Wenxuan Zhang
Guizhen Chen
Xiaobao Wu
A. Luu
Chip Hong Chang
Lidong Bing
VLM
55
11
0
19 May 2023
A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification
Yunlong Feng
Bohan Li
Libo Qin
Xiao Xu
Wanxiang Che
22
3
0
18 Apr 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research
Haritz Puerto
Tim Baumgärtner
Rachneet Sachdeva
Haishuo Fang
Haotian Zhang
Sewin Tariverdian
Kexin Wang
Iryna Gurevych
30
2
0
31 Mar 2023
Evaluating self-attention interpretability through human-grounded experimental protocol
Milan Bhan
Nina Achache
Victor Legrand
A. Blangero
Nicolas Chesneau
26
9
0
27 Mar 2023
How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
Yiran Li
Junpeng Wang
Xin Dai
Liang Wang
Chin-Chia Michael Yeh
Yan Zheng
Wei Zhang
Kwan-Liu Ma
ViT
20
24
0
24 Mar 2023
SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings
Jan Engler
Sandipan Sikdar
Marlene Lutz
M. Strohmaier
32
7
0
11 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
19
11
0
11 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
23
25
0
05 Jan 2023
Black-box language model explanation by context length probing
Ondřej Cífka
Antoine Liutkus
MILM
LRM
24
6
0
30 Dec 2022
PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection
Yuquan Lan
Dongxu Li
Yunqi Zhang
Hui Zhao
Gang Zhao
27
4
0
26 Nov 2022
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
Siyuan Lu
Chenchen Zhou
Keli Xie
Jun Lin
Zhongfeng Wang
29
1
0
16 Nov 2022
Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations
A. Bhat
Ashutosh Modi
35
9
0
07 Nov 2022
MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction
Zhonglin Cao
Rishikesh Magar
Yuyang Wang
A. Farimani
AI4CE
25
88
0
25 Oct 2022
Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking
Xu Yuan
Chengjun Xu
Qiwei Chen
Tao Zhuang
Hongjie Chen
Chong Li
Junfeng Ge
AI4TS
27
0
0
19 Oct 2022
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling
Kalpa Gunaratna
Vijay Srinivasan
Akhila Yerukola
Hongxia Jin
29
6
0
19 Oct 2022
A Transformer-based deep neural network model for SSVEP classification
Jianbo Chen
Yangsong Zhang
Yudong Pan
Peng Xu
Cuntai Guan
22
50
0
09 Oct 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics
Christopher Kuenneth
R. Ramprasad
39
101
0
29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts
Timo Spinde
Manuel Plank
Jan-David Krieger
Terry Ruas
Bela Gipp
Akiko Aizawa
27
69
0
29 Sep 2022
Visual Comparison of Language Model Adaptation
Rita Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
49
16
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
36
136
0
16 Aug 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Raukur
A. Ho
Stephen Casper
Dylan Hadfield-Menell
AAML
AI4CE
28
124
0
27 Jul 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment
Yiwen Shi
Jing Wang
Ping Ren
Taha ValizadehAslani
Yi Zhang
Meng Hu
Hualou Liang
AI4MH
AAML
24
16
0
25 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
55
6
0
09 Jul 2022
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer
Jiashu Pan
Y. Ting 丁
Jie Yu
18
3
0
06 Jul 2022
Ask to Know More: Generating Counterfactual Explanations for Fake Claims
Shih-Chieh Dai
Yi-Li Hsu
Aiping Xiong
Lun-Wei Ku
OffRL
25
22
0
10 Jun 2022
Attention Flows for General Transformers
Niklas Metzger
Christopher Hahn
Julian Siber
Frederik Schmitt
Bernd Finkbeiner
42
0
0
30 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
31
8
0
08 May 2022
An Exploratory Study on Code Attention in BERT
Rishab Sharma
Fuxiang Chen
Fatemeh H. Fard
David Lo
27
25
0
05 Apr 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
34
10
0
31 Mar 2022
Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future
Joyjit Chatterjee
Nina Dethlefs
31
83
0
30 Mar 2022
Self-supervised Video-centralised Transformer for Video Face Clustering
Yujiang Wang
Mingzhi Dong
Jie Shen
Yi-Si Luo
Yiming Lin
Pingchuan Ma
Stavros Petridis
M. Pantic
ViT
28
3
0
24 Mar 2022
GRS: Combining Generation and Revision in Unsupervised Sentence Simplification
Mohammad Dehghan
Dhruv Kumar
Lukasz Golab
29
12
0
18 Mar 2022
Iteratively Prompt Pre-trained Language Models for Chain of Thought
Boshi Wang
Xiang Deng
Huan Sun
KELM
ReLM
LRM
41
95
0
16 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark
Yunhe Gao
Mu Zhou
Ding Liu
Zhennan Yan
Shaoting Zhang
Dimitris N. Metaxas
ViT
MedIm
28
68
0
28 Feb 2022
LISA: Learning Interpretable Skill Abstractions from Language
Divyansh Garg
Skanda Vaidyanath
Kuno Kim
Jiaming Song
Stefano Ermon
LM&Ro
OffRL
156
29
0
28 Feb 2022