2209.11895
In-context Learning and Induction Heads
24 September 2022
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
T. Henighan
Benjamin Mann
Amanda Askell
Yuntao Bai
Anna Chen
Tom Conerly
Dawn Drain
Deep Ganguli
Zac Hatfield-Dodds
Danny Hernandez
Scott R. Johnston
Andy Jones
John Kernion
Liane Lovitt
Kamal Ndousse
Dario Amodei
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
Papers citing
"In-context Learning and Induction Heads"
50 / 434 papers shown
Title
Can Language Models Explain Their Own Classification Behavior?
Dane Sherburn
Bilal Chughtai
Owain Evans
64
1
0
13 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
88
9
0
09 May 2024
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics
Hanlin Zhu
Baihe Huang
Shaolun Zhang
Michael I. Jordan
Jiantao Jiao
Yuandong Tian
Stuart Russell
LRM
AI4CE
106
18
0
07 May 2024
Philosophy of Cognitive Science in the Age of Deep Learning
Raphaël Millière
AI4CE
NAI
76
3
0
07 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphaël Millière
Cameron Buckner
LRM
124
15
0
06 May 2024
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle
Badr Youbi Idrissi
Baptiste Rozière
David Lopez-Paz
Gabriele Synnaeve
114
121
0
30 Apr 2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
318
591
0
30 Apr 2024
Talking Nonsense: Probing Large Language Models' Understanding of Adversarial Gibberish Inputs
Valeriia Cherepanova
James Zou
AAML
102
6
0
26 Apr 2024
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu
Yizhong Wang
Guangxuan Xiao
Hao-Chun Peng
Yao Fu
LRM
105
84
0
24 Apr 2024
Transformers Can Represent n-gram Language Models
Anej Svete
Ryan Cotterell
73
20
0
23 Apr 2024
FMint: Bridging Human Designed and Data Pretrained Models for Differential Equation Foundation Model
Zezheng Song
Jiaxin Yuan
Haizhao Yang
AI4CE
109
18
0
23 Apr 2024
SnapKV: LLM Knows What You are Looking for Before Generation
Yuhong Li
Yingbing Huang
Bowen Yang
Bharat Venkitesh
Acyr Locatelli
Hanchen Ye
Tianle Cai
Patrick Lewis
Deming Chen
VLM
143
210
0
22 Apr 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
135
158
0
22 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
Johannes Schneider
131
35
0
15 Apr 2024
Evidence from counterfactual tasks supports emergent analogical reasoning in large language models
Taylor Webb
K. Holyoak
Hongjing Lu
LRM
ELM
102
6
0
14 Apr 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
Igor Tufanov
Karen Hambardzumyan
Javier Ferrando
Elena Voita
KELM
99
8
0
10 Apr 2024
How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes
Harmon Bhasin
Timothy Ossowski
Yiqiao Zhong
Junjie Hu
40
0
0
04 Apr 2024
Task Agnostic Architecture for Algorithm Induction via Implicit Composition
Sahil J. Sindhi
Ignas Budvytis
82
0
0
03 Apr 2024
Generative Retrieval as Multi-Vector Dense Retrieval
Shiguang Wu
Wenda Wei
Mengqi Zhang
Zhumin Chen
Jun Ma
Zhaochun Ren
Maarten de Rijke
Pengjie Ren
3DV
79
7
0
31 Mar 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
122
227
0
28 Mar 2024
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
Norman Di Palo
Edward Johns
115
37
0
28 Mar 2024
Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models
Ang Lv
Yuhan Chen
Kaiyi Zhang
Yulong Wang
Lifeng Liu
Ji-Rong Wen
Jian Xie
Rui Yan
KELM
76
18
0
28 Mar 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks
Can Rager
Eric J. Michaud
Yonatan Belinkov
David Bau
Aaron Mueller
173
159
0
28 Mar 2024
Mechanistic Design and Scaling of Hybrid Architectures
Michael Poli
Armin W. Thomas
Eric N. D. Nguyen
Pragaash Ponnusamy
Bjorn Deiseroth
...
Brian Hie
Stefano Ermon
Christopher Ré
Ce Zhang
Stefano Massaroli
MoE
116
29
0
26 Mar 2024
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms
Michael Hanna
Sandro Pezzelle
Yonatan Belinkov
93
43
0
26 Mar 2024
Learning Useful Representations of Recurrent Neural Network Weight Matrices
Vincent Herrmann
Francesco Faccio
Jürgen Schmidhuber
70
7
0
18 Mar 2024
On the low-shot transferability of [V]-Mamba
Diganta Misra
Jay Gala
Antonio Orvieto
Mamba
112
1
0
15 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
115
81
0
11 Mar 2024
Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics
Zhu Liu
Cunliang Kong
Ying Liu
Maosong Sun
71
19
0
03 Mar 2024
Learning Associative Memories with Gradient Descent
Vivien A. Cabannes
Berfin Simsek
A. Bietti
88
8
0
28 Feb 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
116
6
0
28 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
126
81
0
27 Feb 2024
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models
Xinran Zhao
Hongming Zhang
Xiaoman Pan
Wenlin Yao
Dong Yu
Tongshuang Wu
Jianshu Chen
HILM
LRM
71
7
0
27 Feb 2024
Explorations of Self-Repair in Language Models
Cody Rushing
Neel Nanda
KELM
MILM
LRM
67
13
0
23 Feb 2024
Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions
Clement Neo
Shay B. Cohen
Fazl Barez
76
5
0
23 Feb 2024
Linear Transformers are Versatile In-Context Learners
Max Vladymyrov
J. Oswald
Mark Sandler
Rong Ge
82
18
0
21 Feb 2024
Do Efficient Transformers Really Save Computation?
Kai-Bo Yang
Jan Ackermann
Zhenyu He
Guhao Feng
Bohang Zhang
Yunzhen Feng
Qiwei Ye
Di He
Liwei Wang
102
19
0
21 Feb 2024
Identifying Semantic Induction Heads to Understand In-Context Learning
Jie Ren
Qipeng Guo
Hang Yan
Dongrui Liu
Xipeng Qiu
Dahua Lin
83
29
0
20 Feb 2024
Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT
Zhengfu He
Xuyang Ge
Qiong Tang
Tianxiang Sun
Qinyuan Cheng
Xipeng Qiu
94
22
0
19 Feb 2024
Prospector Heads: Generalized Feature Attribution for Large Models & Data
Gautam Machiraju
Alexander Derry
Arjun D Desai
Neel Guha
Amir-Hossein Karimi
James Zou
Russ Altman
Christopher Ré
Parag Mallick
AI4TS
MedIm
121
0
0
18 Feb 2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Benjamin L. Edelman
Ezra Edelman
Surbhi Goel
Eran Malach
Nikolaos Tsilivis
BDL
91
56
0
16 Feb 2024
Towards Uncovering How Large Language Model Works: An Explainability Perspective
Haiyan Zhao
Fan Yang
Bo Shen
Himabindu Lakkaraju
Jundong Li
91
13
0
16 Feb 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Yaroslav Aksenov
Nikita Balagansky
Sofia Maria Lo Cicero Vaina
Boris Shaposhnikov
Alexey Gorbatovski
Daniil Gavrilov
KELM
79
5
0
16 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
118
43
0
14 Feb 2024
Spectral Filters, Dark Signals, and Attention Sinks
Nicola Cancedda
115
18
0
14 Feb 2024
Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs
Bilal Chughtai
Alan Cooney
Neel Nanda
HILM
KELM
72
20
0
11 Feb 2024
The Reasons that Agents Act: Intention and Instrumental Goals
Francis Rhys Ward
Matt MacDermott
Francesco Belardinelli
Francesca Toni
Tom Everitt
AI4CE
81
13
0
11 Feb 2024
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie
Guy Gur-Ari
Zohar Ringel
70
1
0
07 Feb 2024
Opening the AI black box: program synthesis via mechanistic interpretability
Eric J. Michaud
Isaac Liao
Vedang Lad
Ziming Liu
Anish Mudide
Chloe Loughridge
Zifan Carl Guo
Tara Rezaei Kheirkhah
Mateja Vukelić
Max Tegmark
88
13
0
07 Feb 2024
A Resource Model For Neural Scaling Law
Jinyeop Song
Ziming Liu
Max Tegmark
Jeff Gore
158
4
0
07 Feb 2024