In-context Learning and Induction Heads
24 September 2022
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
T. Henighan
Benjamin Mann
Amanda Askell
Yuntao Bai
Anna Chen
Tom Conerly
Dawn Drain
Deep Ganguli
Zac Hatfield-Dodds
Danny Hernandez
Scott R. Johnston
Andy Jones
Jackson Kernion
Liane Lovitt
Kamal Ndousse
Dario Amodei
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
arXiv: 2209.11895
Papers citing "In-context Learning and Induction Heads" (50 of 434 papers shown)
Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT
D. Hazineh
Zechen Zhang
Jeffery Chiu
56
9
0
11 Oct 2023
Humans and language models diverge when predicting repeating text
Aditya R. Vaidya
Javier S. Turek
Alexander G. Huth
69
6
0
10 Oct 2023
Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition
Zhongtian Chen
Edmund Lau
Jake Mendel
Susan Wei
Daniel Murfet
61
15
0
10 Oct 2023
The Importance of Prompt Tuning for Automated Neuron Explanations
Justin Lee
Tuomas P. Oikarinen
Arjun Chatha
Keng-Chi Chang
Yilan Chen
Tsui-Wei Weng
LRM
68
8
0
09 Oct 2023
Towards Better Chain-of-Thought Prompting Strategies: A Survey
Zihan Yu
Liang He
Zhen Wu
Xinyu Dai
Jiajun Chen
LRM
167
55
0
08 Oct 2023
Uncovering hidden geometry in Transformers via disentangling position and context
Jiajun Song
Yiqiao Zhong
80
10
0
07 Oct 2023
Copy Suppression: Comprehensively Understanding an Attention Head
Callum McDougall
Arthur Conmy
Cody Rushing
Thomas McGrath
Neel Nanda
MILM
73
46
0
06 Oct 2023
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Deniz Bayazit
Negar Foroutan
Zeming Chen
Gail Weiss
Antoine Bosselut
KELM
105
16
0
04 Oct 2023
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
S. Bhattamishra
Arkil Patel
Phil Blunsom
Varun Kanade
97
56
0
04 Oct 2023
DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models
Albert Garde
Esben Kran
Fazl Barez
85
2
0
03 Oct 2023
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
Praneeth Kacham
Vahab Mirrokni
Peilin Zhong
95
14
0
02 Oct 2023
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian
Yiping Wang
Zhenyu Zhang
Beidi Chen
Simon Shaolei Du
82
41
0
01 Oct 2023
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
Xuansheng Wu
Wenlin Yao
Jianshu Chen
Xiaoman Pan
Xiaoyang Wang
Ninghao Liu
Dong Yu
LRM
85
33
0
30 Sep 2023
Understanding In-Context Learning from Repetitions
Jianhao Yan
Jin Xu
Chiyu Song
Chenming Wu
Yafu Li
Yue Zhang
112
24
0
30 Sep 2023
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
Zhihan Liu
Hao Hu
Shenao Zhang
Hongyi Guo
Shuqi Ke
Boyi Liu
Zhaoran Wang
LLMAG
LRM
143
41
0
29 Sep 2023
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang
Neel Nanda
LLMSV
205
115
0
27 Sep 2023
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
Mert Yuksekgonul
Varun Chandrasekaran
Erik Jones
Suriya Gunasekar
Ranjita Naik
Hamid Palangi
Ece Kamar
Besmira Nushi
HILM
60
49
0
26 Sep 2023
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
Yonggan Fu
Yongan Zhang
Zhongzhi Yu
Sixu Li
Zhifan Ye
Chaojian Li
Cheng Wan
Ying Lin
108
69
0
19 Sep 2023
Rigorously Assessing Natural Language Explanations of Neurons
Jing-ling Huang
Atticus Geiger
Karel D'Oosterlinck
Zhengxuan Wu
Christopher Potts
MILM
77
29
0
19 Sep 2023
Breaking through the learning plateaus of in-context learning in Transformer
Jingwen Fu
Tao Yang
Yuwang Wang
Yan Lu
Nanning Zheng
71
3
0
12 Sep 2023
Uncovering mesa-optimization algorithms in Transformers
J. Oswald
Eyvind Niklasson
Maximilian Schlegel
Seijin Kobayashi
Nicolas Zucchet
...
Mark Sandler
Blaise Agüera y Arcas
Max Vladymyrov
Razvan Pascanu
João Sacramento
72
64
0
11 Sep 2023
Explaining grokking through circuit efficiency
Vikrant Varma
Rohin Shah
Zachary Kenton
János Kramár
Ramana Kumar
94
55
0
05 Sep 2023
Gated recurrent neural networks discover attention
Nicolas Zucchet
Seijin Kobayashi
Yassir Akram
J. Oswald
Maxime Larcher
Angelika Steger
João Sacramento
86
8
0
04 Sep 2023
NeuroSurgeon: A Toolkit for Subnetwork Analysis
Michael A. Lepori
Ellie Pavlick
Thomas Serre
83
7
0
01 Sep 2023
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu
Tianzhe Chu
Shengbang Tong
Ziyang Wu
Druv Pai
Sam Buchanan
Yi Ma
ViT
50
22
0
30 Aug 2023
Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability
Tyler A. Chang
Zhuowen Tu
Benjamin Bergen
59
13
0
29 Aug 2023
Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
Vedant Palit
Rohan Pandey
Aryaman Arora
Paul Pu Liang
86
23
0
27 Aug 2023
Artificial Intelligence and Aesthetic Judgment
Jessica Hullman
Ari Holtzman
Andrew Gelman
41
3
0
21 Aug 2023
Latent State Models of Training Dynamics
Michael Y. Hu
Angelica Chen
Naomi Saphra
Kyunghyun Cho
113
8
0
18 Aug 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
80
13
0
15 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
92
15
0
31 Jul 2023
When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities
Jin Chen
Zheng Liu
Xunpeng Huang
Chenwang Wu
Qi Liu
...
Yuxuan Lei
Xiaolong Chen
Xingmei Wang
Defu Lian
Enhong Chen
ALM
92
129
0
31 Jul 2023
The Hydra Effect: Emergent Self-repair in Language Model Computations
Tom McGrath
Matthew Rahtz
János Kramár
Vladimir Mikulik
Shane Legg
MILM
LRM
72
73
0
28 Jul 2023
What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Hengyu Fu
Tianyu Guo
Yu Bai
Song Mei
MLT
102
26
0
21 Jul 2023
Towards A Unified Agent with Foundation Models
Norman Di Palo
Arunkumar Byravan
Leonard Hasenclever
Markus Wulfmeier
N. Heess
Martin Riedmiller
LM&Ro
LLMAG
OffRL
83
60
0
18 Jul 2023
The semantic landscape paradigm for neural networks
Shreyas Gokhale
94
2
0
18 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
92
59
0
18 Jul 2023
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla
Tom Lieberum
Matthew Rahtz
János Kramár
Neel Nanda
G. Irving
Rohin Shah
Vladimir Mikulik
103
115
0
18 Jul 2023
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks
Yi-Syuan Chen
Yun-Zhu Song
Cheng Yu Yeo
Bei Liu
Jianlong Fu
Hong-Han Shuai
VLM
LRM
92
4
0
15 Jul 2023
Large Language Models
Michael R Douglas
LLMAG
LM&MA
177
645
0
11 Jul 2023
Frontier AI Regulation: Managing Emerging Risks to Public Safety
Markus Anderljung
Joslyn Barnhart
Anton Korinek
Jade Leung
Cullen O'Keefe
...
Jonas Schuett
Yonadav Shavit
Divya Siddarth
Robert F. Trager
Kevin J. Wolf
SILM
145
125
0
06 Jul 2023
Trainable Transformer in Transformer
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
118
13
0
03 Jul 2023
The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks
Ziqian Zhong
Ziming Liu
Max Tegmark
Jacob Andreas
95
102
0
30 Jun 2023
Understanding In-Context Learning via Supportive Pretraining Data
Xiaochuang Han
Daniel Simig
Todor Mihaylov
Yulia Tsvetkov
Asli Celikyilmaz
Tianlu Wang
AIMat
113
38
0
26 Jun 2023
Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression
Allan Raventós
Mansheej Paul
F. Chen
Surya Ganguli
127
87
0
26 Jun 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
118
86
0
26 Jun 2023
Large Sequence Models for Sequential Decision-Making: A Survey
Muning Wen
Runji Lin
Hanjing Wang
Yaodong Yang
Ying Wen
Kai Zou
Jun Wang
Haifeng Zhang
Weinan Zhang
LM&Ro
LRM
98
36
0
24 Jun 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
98
3
0
22 Jun 2023
An Overview of Catastrophic AI Risks
Dan Hendrycks
Mantas Mazeika
Thomas Woodside
SILM
82
186
0
21 Jun 2023
SynerGPT: In-Context Learning for Personalized Drug Synergy Prediction and Drug Design
Carl Edwards
Aakanksha Naik
Tushar Khot
Martin D. Burke
Heng Ji
Tom Hope
130
16
0
19 Jun 2023