ResearchTrend.AI
What Does BERT Look At? An Analysis of BERT's Attention

11 June 2019
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
    MILM

Papers citing "What Does BERT Look At? An Analysis of BERT's Attention"

50 / 885 papers shown
Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models
Paweł Mąka
Yusuf Can Semerci
Jan Scholtes
Gerasimos Spanakis
86
0
0
15 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
15
0
03 Dec 2024
StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training
Kaustubh Ponkshe
Venkatapathy Subramanian
Natwar Modani
Ganesh Ramakrishnan
70
0
0
25 Nov 2024
Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
DiffM
55
0
0
12 Nov 2024
Phase Diagram of Vision Large Language Models Inference: A Perspective from Interaction across Image and Instruction
Houjing Wei
Hakaze Cho
Yuting Shi
MLLM
38
0
0
01 Nov 2024
Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation
Ahmed Akib Jawad Karim
Kazi Hafiz Md. Asad
Md. Golam Rabiul Alam
AI4MH
44
2
0
30 Oct 2024
Abrupt Learning in Transformers: A Case Study on Matrix Completion
Pulkit Gopalani
Ekdeep Singh Lubana
Wei Hu
45
3
0
29 Oct 2024
Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics
Isabelle G. Lee
Joshua Lum
Ziyi Liu
Dani Yogatama
LRM
24
0
0
28 Oct 2024
On Explaining with Attention Matrices
Omar Naim
Nicholas Asher
29
1
0
24 Oct 2024
From Attention to Activation: Unravelling the Enigmas of Large Language Models
Prannay Kaul
Chengcheng Ma
Ismail Elezi
Jiankang Deng
28
2
0
22 Oct 2024
A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles
Eun-Kyoung Rosa Lee
Sathvik Nair
Naomi Feldman
62
4
0
21 Oct 2024
Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs
Rui Pu
Chaozhuo Li
Rui Ha
Zejian Chen
Litian Zhang
Ziqiang Liu
Lirong Qiu
Xi Zhang
AAML
34
1
0
18 Oct 2024
Analyzing Deep Transformer Models for Time Series Forecasting via Manifold Learning
Ilya Kaufman
Omri Azencot
AI4TS
31
2
0
17 Oct 2024
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Fan Zhang
Yongbin Li
59
5
0
17 Oct 2024
Linguistically Grounded Analysis of Language Models using Shapley Head Values
Marcell Richard Fekete
Johannes Bjerva
31
0
0
17 Oct 2024
PromptExp: Multi-granularity Prompt Explanation of Large Language Models
Ximing Dong
Shaowei Wang
Dayi Lin
Gopi Krishnan Rajbahadur
Boquan Zhou
Shichao Liu
Ahmed E. Hassan
AAML
LRM
30
1
0
16 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
38
1
0
16 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
57
7
0
15 Oct 2024
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Shuo Li
Tao Ji
Xiaoran Fan
Linsheng Lu
L. Yang
...
Yixuan Wang
Xiaohui Zhao
Tao Gui
Qi Zhang
Xuanjing Huang
42
0
0
15 Oct 2024
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai
Reuben Tan
Jianrui Zhang
Bocheng Zou
Kai Zhang
...
Yao Dou
J. Park
Jianfeng Gao
Yong Jae Lee
Jianwei Yang
44
12
0
14 Oct 2024
Inference and Verbalization Functions During In-Context Learning
Junyi Tao
Xiaoyin Chen
Nelson F. Liu
LRM
ReLM
26
0
0
12 Oct 2024
Robust AI-Generated Text Detection by Restricted Embeddings
Kristian Kuznetsov
Eduard Tulchinskii
Laida Kushnareva
German Magai
Serguei Barannikov
Sergey I. Nikolenko
Irina Piontkovskaya
DeLMO
38
3
0
10 Oct 2024
Neuropsychology of AI: Relationship Between Activation Proximity and Categorical Proximity Within Neural Categories of Synthetic Cognition
Michael Pichat
Enola Campoli
William Pogrund
Jourdan Wilson
Michael Veillet-Guillem
Anton Melkozerov
Paloma Pichat
Armanouche Gasparian
Samuel Demarchi
Judicael Poumay
NAI
53
3
0
08 Oct 2024
Mechanistic?
Naomi Saphra
Sarah Wiegreffe
AI4CE
29
9
0
07 Oct 2024
Explanation sensitivity to the randomness of large language models: the case of journalistic text classification
Jérémie Bogaert
Marie-Catherine de Marneffe
Antonin Descampe
Louis Escouflaire
Cedrick Fairon
François-Xavier Standaert
24
1
0
07 Oct 2024
Understanding Reasoning in Chain-of-Thought from the Hopfieldian View
Lijie Hu
Liang Liu
Shu Yang
Xin Chen
Zhen Tan
Muhammad Asif Ali
Mengdi Li
Di Wang
LRM
46
1
0
04 Oct 2024
How Language Models Prioritize Contextual Grammatical Cues?
Hamidreza Amirzadeh
A. Alishahi
Hosein Mohebbi
23
0
0
04 Oct 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models
Tengfei Yu
Xuebo Liu
Zhiyi Hou
Liang Ding
Dacheng Tao
Min Zhang
32
0
0
04 Oct 2024
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang
Jesse Hoogland
Stan van Wingerden
Zach Furman
Daniel Murfet
OffRL
34
7
0
03 Oct 2024
Racing Thoughts: Explaining Contextualization Errors in Large Language Models
Michael A. Lepori
Michael Mozer
Asma Ghandeharioun
LRM
85
1
0
02 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
140
2
0
02 Oct 2024
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Keivan Alizadeh
Iman Mirzadeh
Hooman Shahrokhi
Dmitry Belenko
Frank Sun
Minsik Cho
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
MoE
31
1
0
01 Oct 2024
Enhancing elusive clues in knowledge learning by contrasting attention of language models
Jian Gao
Xiao Zhang
Ji Wu
Miao Li
43
0
0
26 Sep 2024
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions
Zeyneb N. Kaya
Souvick Ghosh
42
0
0
25 Sep 2024
Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
Xufeng Duan
Xinyu Zhou
Bei Xiao
Zhenguang G. Cai
MILM
35
3
0
24 Sep 2024
Supervised Fine-Tuning Achieve Rapid Task Adaption Via Alternating Attention Head Activation Patterns
Yang Zhao
Li Du
Xiao Ding
Kai Xiong
Ting Liu
Bing Qin
23
2
0
24 Sep 2024
Investigating Layer Importance in Large Language Models
Yang Zhang
Yanfei Dong
Kenji Kawaguchi
FAtt
54
6
0
22 Sep 2024
Probing Context Localization of Polysemous Words in Pre-trained Language Model Sub-Layers
Soniya Vijayakumar
Josef van Genabith
Simon Ostermann
21
0
0
21 Sep 2024
Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence
Alessandro Riva
Alessandro Raganato
Simone Melzi
3DPC
36
0
0
20 Sep 2024
Attention-Seeker: Dynamic Self-Attention Scoring for Unsupervised Keyphrase Extraction
Erwin D. López Z.
Cheng Tang
Atsushi Shimada
21
1
0
17 Sep 2024
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?
Yixuan Tang
Yi Yang
33
3
0
04 Sep 2024
Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
33
0
0
23 Aug 2024
Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience
Zhonghao He
Jascha Achterberg
Katie Collins
Kevin K. Nejad
Danyal Akarca
...
Chole Li
Kai J. Sandbrink
Stephen Casper
Anna Ivanova
Grace W. Lindsay
AI4CE
28
1
0
22 Aug 2024
The Self-Contained Negation Test Set
David Kletz
Pascal Amsili
Marie Candito
16
1
0
21 Aug 2024
Reading with Intent
Benjamin Z. Reichman
Kartik Talamadupula
Toshish Jawale
Larry Heck
RALM
37
0
0
20 Aug 2024
Training an NLP Scholar at a Small Liberal Arts College: A Backwards Designed Course Proposal
Grusha Prasad
Forrest Davis
26
0
0
11 Aug 2024
Analysis of Argument Structure Constructions in the Large Language Model BERT
Pegah Ramezani
Achim Schilling
Patrick Krauss
39
1
0
08 Aug 2024
Finch: Prompt-guided Key-Value Cache Compression
Giulio Corallo
Paolo Papotti
38
3
0
31 Jul 2024
Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification
Vivi Nastase
Paola Merlo
38
2
0
25 Jul 2024
Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads
Xihui Lin
Yunan Zhang
Suyu Ge
Barun Patra
Vishrav Chaudhary
Hao Peng
Xia Song
35
0
0
25 Jul 2024