arXiv:2409.03752
Attention Heads of Large Language Models: A Survey
5 September 2024
Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li [LRM]
Papers citing "Attention Heads of Large Language Models: A Survey" (43 of 43 papers shown)
Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach
J. Yang, Dapeng Chen, Yajing Sun, Rongjun Li, Zhiyong Feng, Wei Peng (19 Jan 2025)

Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Yanwen Huang, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao (02 Jan 2025)

On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Kun Wang, Yang Liu, Sihang Li, Yongbin Li (17 Oct 2024)

The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun, Max Tegmark (10 Oct 2024)

Controllable Text Generation for Large Language Models: A Survey
Xun Liang, Hanyu Wang, Yezhaohui Wang, Shichao Song, Jiawei Yang, ..., Jie Hu, Dan Liu, Shunyu Yao, Feiyu Xiong, Zhiyu Li (22 Aug 2024)

A Mechanistic Interpretation of Syllogistic Reasoning in Auto-Regressive Language Models
Geonhee Kim, Marco Valentino, André Freitas (16 Aug 2024) [LRM, AI4CE]
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu, Jongyoon Song, Bongkyu Hwang, Hoyoung Kang, Sooah Cho, Junhwa Choi, Seongho Joe, Taehee Lee, Youngjune Gwon, Sungroh Yoon (31 Jul 2024)

Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass (09 Jul 2024) [HILM]

MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Tianyu Fu, Haofeng Huang, Xuefei Ning, Genghan Zhang, Boju Chen, ..., Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang (21 Jun 2024) [MQ]

Knowledge Circuits in Pretrained Transformers
Yunzhi Yao, Ningyu Zhang, Zekun Xi, Meng Wang, Ziwen Xu, Shumin Deng, Huajun Chen (28 May 2024) [KELM]

How does GPT-2 Predict Acronyms? Extracting and Understanding a Circuit via Mechanistic Interpretability
Jorge García-Carrasco, Alejandro Maté, Juan Trujillo (07 May 2024)

How to use and interpret activation patching
Stefan Heimersheim, Neel Nanda (23 Apr 2024)
Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs
Bilal Chughtai, Alan Cooney, Neel Nanda (11 Feb 2024) [HILM, KELM]

How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu, Sophia Ananiadou (05 Feb 2024)

Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain
Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, A. Mehta, N. Mesgarani (31 Jan 2024)

Successor Heads: Recurring, Interpretable Attention Heads In The Wild
Rhys Gould, Euan Ong, George Ogden, Arthur Conmy (14 Dec 2023) [LRM]

The mechanistic basis of data dependence and abrupt learning in an in-context classification task
Gautam Reddy (03 Dec 2023)

Aspects of human memory and Large Language Models
R. Janik (07 Nov 2023)

Function Vectors in Large Language Models
Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau (23 Oct 2023)

Circuit Component Reuse Across Tasks in Transformer Language Models
Jack Merullo, Carsten Eickhoff, Ellie Pavlick (12 Oct 2023)
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang, Neel Nanda (27 Sep 2023) [LLMSV]

Linearity of Relation Decoding in Transformer Language Models
Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau (17 Aug 2023) [KELM]

Birth of a Transformer: A Memory Viewpoint
A. Bietti, Vivien A. Cabannes, Diane Bouchacourt, Hervé Jégou, Léon Bottou (01 Jun 2023)

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah (24 Sep 2022)

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Raukur, A. Ho, Stephen Casper, Dylan Hadfield-Menell (27 Jul 2022) [AAML, AI4CE]

Language models show human-like content effects on reasoning tasks
Ishita Dasgupta, Andrew Kyle Lampinen, Stephanie C. Y. Chan, Hannah R. Sheahan, Antonia Creswell, D. Kumaran, James L. McClelland, Felix Hill (14 Jul 2022) [ReLM, LRM]
A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers, Flavius Frasincar (27 Mar 2022)

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou (28 Jan 2022) [LM&Ro, LRM, AI4CE, ReLM]

Training Verifiers to Solve Math Word Problems
K. Cobbe, V. Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, ..., Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman (27 Oct 2021) [ReLM, OffRL, LRM]

Predictive Coding: a Theoretical and Experimental Review
Beren Millidge, A. Seth, Christopher L. Buckley (27 Jul 2021) [AI4CE]

RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu (20 Apr 2021)

Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei (18 Apr 2021) [KELM, MU]

Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva, R. Schuster, Jonathan Berant, Omer Levy (29 Dec 2020) [KELM]
On Layer Normalization in the Transformer Architecture
Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, Liwei Wang, Tie-Yan Liu (12 Feb 2020) [AI4CE]

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei (23 Jan 2020)

Adaptively Sparse Transformers
Gonçalo M. Correia, Vlad Niculae, André F. T. Martins (30 Aug 2019)

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov (23 May 2019)

An Attentive Survey of Attention Models
S. Chaudhari, Varun Mithal, Gungor Polatkan, R. Ramanath (05 Apr 2019)

BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model
Alex Jinpeng Wang, Kyunghyun Cho (11 Feb 2019) [VLM]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (11 Oct 2018) [VLM, SSL, SSeg]
Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
Shashi Narayan, Shay B. Cohen, Mirella Lapata (27 Aug 2018) [AILaw]

Methods for Interpreting and Understanding Deep Neural Networks
G. Montavon, Wojciech Samek, K. Müller (24 Jun 2017) [FaML]

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer (09 May 2017) [RALM]