A Multiscale Visualization of Attention in the Transformer Model

12 June 2019

Papers citing "A Multiscale Visualization of Attention in the Transformer Model"


Title
What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models Jan Kapar Niklas Koenen Martin Jullum 66 0 0 29 Apr 2025
Discovering Influential Neuron Path in Vision Transformers Yifan Wang Yifei Liu Yingdong Shi Chong Li Anqi Pang Sibei Yang Jingyi Yu Kan Ren ViT 71 0 0 12 Mar 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach Dishanika Denipitiyage B. Silva Suranga Seneviratne A. Seneviratne Sanjay Chawla 48 0 0 07 Feb 2025
Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention Zhenyu Guo Wenguang Chen 48 0 0 01 Jan 2025
On the Role of Attention Heads in Large Language Model Safety Zhenhong Zhou Haiyang Yu Xinghua Zhang Rongwu Xu Fei Huang Kun Wang Yang Liu Fan Zhang Yongbin Li 59 5 0 17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 42 13 0 27 Jul 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models Chengzhengxu Li Xiaoming Liu Zhaohan Zhang Yichen Wang Chen Liu Y. Lan Chao Shen 60 2 0 15 Jun 2024
Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics Fangru Lin Daniel Altshuler J. Pierrehumbert 38 1 0 04 Apr 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini Jacopo Staiano Bruno Lepri Raffaele Marino MoE 34 1 0 13 Mar 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis Shashank Kotyan Tatsuya Ueda Danilo Vasconcellos Vargas 32 1 0 07 Dec 2023
On the Importance of Step-wise Embeddings for Heterogeneous Clinical Time-Series Rita Kuznetsova Alizée Pace Manuel Burger Hugo Yèche Gunnar Rätsch AI4TS 39 5 0 15 Nov 2023
AMPLIFY:Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer Leixin Yang Yu Xiang 31 0 0 22 Sep 2023
Nebula: Self-Attention for Dynamic Malware Analysis Dmitrijs Trizna Christian Scano Battista Biggio Fabio Roli 24 13 0 19 Sep 2023
Attention Visualizer Package: Revealing Word Importance for Deeper Insight into Encoder-Only Transformer Models A. A. Falaki R. Gras ViT 26 7 0 28 Aug 2023
Towards the Visualization of Aggregated Class Activation Maps to Analyse the Global Contribution of Class Features Igor Cherepanov D. Sessler Alex Ulmer Hendrik Lücke-Tieke Jörn Kohlhammer FAtt 24 0 0 29 Jul 2023
Zero-Shot Text Classification via Self-Supervised Tuning Chaoqun Liu Wenxuan Zhang Guizhen Chen Xiaobao Wu A. Luu Chip Hong Chang Lidong Bing VLM 55 11 0 19 May 2023
A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification Yunlong Feng Bohan Li Libo Qin Xiao Xu Wanxiang Che 22 3 0 18 Apr 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research Haritz Puerto Tim Baumgärtner Rachneet Sachdeva Haishuo Fang Haotian Zhang Sewin Tariverdian Kexin Wang Iryna Gurevych 30 2 0 31 Mar 2023
Evaluating self-attention interpretability through human-grounded experimental protocol Milan Bhan Nina Achache Victor Legrand A. Blangero Nicolas Chesneau 26 9 0 27 Mar 2023
How Does Attention Work in Vision Transformers? A Visual Analytics Attempt Yiran Li Junpeng Wang Xin Dai Liang Wang Chin-Chia Michael Yeh Yan Zheng Wei Zhang Kwan-Liu Ma ViT 20 24 0 24 Mar 2023
SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings Jan Engler Sandipan Sikdar Marlene Lutz M. Strohmaier 32 7 0 11 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference R. Brath Daniel A. Keim Johannes Knittel Shimei Pan Pia Sommerauer Hendrik Strobelt 19 11 0 11 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention Shashanka Venkataramanan Amir Ghodrati Yuki M. Asano Fatih Porikli A. Habibian ViT 23 25 0 05 Jan 2023
Black-box language model explanation by context length probing Ondřej Cífka Antoine Liutkus MILM LRM 24 6 0 30 Dec 2022
PCRED: Zero-shot Relation Triplet Extraction with Potential Candidate Relation Selection and Entity Boundary Detection Yuquan Lan Dongxu Li Yunqi Zhang Hui Zhao Gang Zhao 27 4 0 26 Nov 2022
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT Siyuan Lu Chenchen Zhou Keli Xie Jun Lin Zhongfeng Wang 29 1 0 16 Nov 2022
Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations A. Bhat Ashutosh Modi 35 9 0 07 Nov 2022
MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction Zhonglin Cao Rishikesh Magar Yuyang Wang A. Farimani AI4CE 25 88 0 25 Oct 2022
Hierarchical Multi-Interest Co-Network For Coarse-Grained Ranking Xu Yuan Chengjun Xu Qiwei Chen Tao Zhuang Hongjie Chen Chong Li Junfeng Ge AI4TS 27 0 0 19 Oct 2022
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling Kalpa Gunaratna Vijay Srinivasan Akhila Yerukola Hongxia Jin 29 6 0 19 Oct 2022
A Transformer-based deep neural network model for SSVEP classification Jianbo Chen Yangsong Zhang Yudong Pan Peng Xu Cuntai Guan 22 50 0 09 Oct 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics Christopher Kuenneth R. Ramprasad 39 101 0 29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts Timo Spinde Manuel Plank Jan-David Krieger Terry Ruas Bela Gipp Akiko Aizawa 27 69 0 29 Sep 2022
Visual Comparison of Language Model Adaptation Rita Sevastjanova E. Cakmak Shauli Ravfogel Ryan Cotterell Mennatallah El-Assady VLM 49 16 0 17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models Hendrik Strobelt Albert Webson Victor Sanh Benjamin Hoover Johanna Beyer Hanspeter Pfister Alexander M. Rush VLM 36 136 0 16 Aug 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks Tilman Raukur A. Ho Stephen Casper Dylan Hadfield-Menell AAML AI4CE 28 124 0 27 Jul 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Yiwen Shi Jing Wang Ping Ren Taha ValizadehAslani Yi Zhang Meng Hu Hualou Liang AI4MH AAML 24 16 0 25 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval Wenqiao Zhang Jiannan Guo Meng Li Haochen Shi Shengyu Zhang Juncheng Li Siliang Tang Yueting Zhuang 55 6 0 09 Jul 2022
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer Jiashu Pan Y. Ting 丁 Jie Yu 18 3 0 06 Jul 2022
Ask to Know More: Generating Counterfactual Explanations for Fake Claims Shih-Chieh Dai Yi-Li Hsu Aiping Xiong Lun-Wei Ku OffRL 25 22 0 10 Jun 2022
Attention Flows for General Transformers Niklas Metzger Christopher Hahn Julian Siber Frederik Schmitt Bernd Finkbeiner 42 0 0 30 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information Chiyu Feng Po-Chun Hsu Hung-yi Lee SSL 31 8 0 08 May 2022
An Exploratory Study on Code Attention in BERT Rishab Sharma Fuxiang Chen Fatemeh H. Fard David Lo 27 25 0 05 Apr 2022
Interpretation of Black Box NLP Models: A Survey Shivani Choudhary N. Chatterjee S. K. Saha FAtt 34 10 0 31 Mar 2022
Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future Joyjit Chatterjee Nina Dethlefs 31 83 0 30 Mar 2022
Self-supervised Video-centralised Transformer for Video Face Clustering Yujiang Wang Mingzhi Dong Jie Shen Yi-Si Luo Yiming Lin Pingchuan Ma Stavros Petridis M. Pantic ViT 28 3 0 24 Mar 2022
GRS: Combining Generation and Revision in Unsupervised Sentence Simplification Mohammad Dehghan Dhruv Kumar Lukasz Golab 29 12 0 18 Mar 2022
Iteratively Prompt Pre-trained Language Models for Chain of Thought Boshi Wang Xiang Deng Huan Sun KELM ReLM LRM 41 95 0 16 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark Yunhe Gao Mu Zhou Ding Liu Zhennan Yan Shaoting Zhang Dimitris N. Metaxas ViT MedIm 28 68 0 28 Feb 2022
LISA: Learning Interpretable Skill Abstractions from Language Divyansh Garg Skanda Vaidyanath Kuno Kim Jiaming Song Stefano Ermon LM&Ro OffRL 156 29 0 28 Feb 2022