The mechanistic basis of data dependence and abrupt learning in an in-context classification task
Gautam Reddy
arXiv:2312.03002 · 3 December 2023
Papers citing "The mechanistic basis of data dependence and abrupt learning in an in-context classification task" (45 papers shown)
- Context parroting: A simple but tough-to-beat baseline for foundation models in scientific machine learning — Yuanzhao Zhang, William Gilpin [AI4TS] (16 May 2025)
- Understanding In-context Learning of Addition via Activation Subspaces — Xinyan Hu, Kayo Yin, Michael I. Jordan, Jacob Steinhardt, Lijie Chen (08 May 2025)
- ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers — Zhouxiang Fang, Aayush Mishra, Muhan Gao, Anqi Liu, Daniel Khashabi (28 Apr 2025)
- In-Context Learning can distort the relationship between sequence likelihoods and biological fitness — Pranav Kantroo, Günter P. Wagner, Benjamin B. Machta (23 Apr 2025)
- How do language models learn facts? Dynamics, curricula and hallucinations — Nicolas Zucchet, J. Bornschein, Stephanie C. Y. Chan, Andrew Kyle Lampinen, Razvan Pascanu, Soham De [KELM, HILM, LRM] (27 Mar 2025)
- Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding — Shunqi Mao, Chaoyi Zhang, Weidong Cai [MLLM] (13 Mar 2025)
- Strategy Coopetition Explains the Emergence and Transience of In-Context Learning — Aaditya K. Singh, Ted Moskovitz, Sara Dragutinovic, Felix Hill, Stephanie C. Y. Chan, Andrew Saxe (07 Mar 2025)
- Training Dynamics of In-Context Learning in Linear Attention — Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe [MLT] (28 Jan 2025)
- What Matters for In-Context Learning: A Balancing Act of Look-up and In-Weight Learning — Jelena Bratulić, Sudhanshu Mittal, Christian Rupprecht, Thomas Brox (09 Jan 2025)
- A ghost mechanism: An analytical model of abrupt learning — Fatih Dinc, Ege Cirakman, Yiqi Jiang, Mert Yuksekgonul, Mark J. Schnitzer, Hidenori Tanaka (04 Jan 2025)
- Out-of-distribution generalization via composition: a lens through induction heads in Transformers — Jiajun Song, Zhuoyan Xu, Yiqiao Zhong (31 Dec 2024)
- Quantifying artificial intelligence through algebraic generalization — Takuya Ito, Murray Campbell, L. Horesh, Tim Klinger, Parikshit Ram [ELM] (08 Nov 2024)
- Toward Understanding In-context vs. In-weight Learning — Bryan Chan, Xinyi Chen, András Gyorgy, Dale Schuurmans (30 Oct 2024)
- Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs — Tianyu Guo, Druv Pai, Yu Bai, Jiantao Jiao, Michael I. Jordan, Song Mei (17 Oct 2024)
- Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent — Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song (15 Oct 2024)
- Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition — Zheyang Xiong, Ziyang Cai, John Cooper, Albert Ge, Vasilis Papageorgiou, ..., Saurabh Agarwal, Grigorios G Chrysos, Samet Oymak, Kangwook Lee, Dimitris Papailiopoulos [LRM] (08 Oct 2024)
- Transformers learn variable-order Markov chains in-context — Ruida Zhou, C. Tian, Suhas Diggavi (07 Oct 2024)
- Deeper Insights Without Updates: The Power of In-Context Learning Over Fine-Tuning — Qingyu Yin, Xuzheng He, Luoao Deng, Chak Tou Leong, Fan Wang, Yanzhao Yan, Xiaoyu Shen, Qiang Zhang (07 Oct 2024)
- Task Diversity Shortens the ICL Plateau — Jaeyeon Kim, Sehyun Kwon, Joo Young Choi, Jongho Park, Jaewoong Cho, Jason D. Lee, Ernest K. Ryu [MoMe] (07 Oct 2024)
- GAMformer: In-Context Learning for Generalized Additive Models — Andreas Mueller, Julien N. Siems, Harsha Nori, David Salinas, Arber Zela, Rich Caruana, Frank Hutter [AI4CE] (06 Oct 2024)
- Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information — Yongheng Zhang, Qiguang Chen, Jingxuan Zhou, Peng Wang, Jiasheng Si, Jin Wang, Wenpeng Lu, Libo Qin [LRM] (06 Oct 2024)
- Attention Prompting on Image for Large Vision-Language Models — Runpeng Yu, Weihao Yu, Xinchao Wang [VLM] (25 Sep 2024)
- Attention Heads of Large Language Models: A Survey — Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li [LRM] (05 Sep 2024)
- Representing Rule-based Chatbots with Transformers — Dan Friedman, Abhishek Panigrahi, Danqi Chen (15 Jul 2024)
- A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models — Daking Rai, Yilun Zhou, Shi Feng, Abulhair Saparov, Ziyu Yao (02 Jul 2024)
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning — Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Ji-Rong Wen (20 Jun 2024)
- Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning — Hui Liu, Wenya Wang, Hao Sun, Chris Xing Tian, Chenqi Kong, Xin Dong, Haoliang Li (14 Jun 2024)
- Talking Heads: Understanding Inter-layer Communication in Transformer Language Models — Jack Merullo, Carsten Eickhoff, Ellie Pavlick (13 Jun 2024)
- Iteration Head: A Mechanistic Study of Chain-of-Thought — Vivien A. Cabannes, Charles Arnal, Wassim Bouaziz, Alice Yang, Francois Charton, Julia Kempe [LRM] (04 Jun 2024)
- Why Larger Language Models Do In-context Learning Differently? — Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang (30 May 2024)
- IM-Context: In-Context Learning for Imbalanced Regression Tasks — Ismail Nejjar, Faez Ahmed, Olga Fink (28 May 2024)
- Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting — Suraj Anand, Michael A. Lepori, Jack Merullo, Ellie Pavlick [CLL] (28 May 2024)
- Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification — Shang Liu, Zhongze Cai, Guanting Chen, Xiaocheng Li [UQCV] (24 May 2024)
- Linking In-context Learning in Transformers to Human Episodic Memory — Ji-An Li, Corey Y. Zhou, M. Benna, Marcelo G. Mattar (23 May 2024)
- Asymptotic theory of in-context learning by linear attention — Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, C. Pehlevan (20 May 2024)
- Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models — Ang Lv, Yuhan Chen, Kaiyi Zhang, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan [KELM] (28 Mar 2024)
- The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains — Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis [BDL] (16 Feb 2024)
- A phase transition between positional and semantic learning in a solvable model of dot-product attention — Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová [MLT] (06 Feb 2024)
- Is Mamba Capable of In-Context Learning? — Riccardo Grazzi, Julien N. Siems, Simon Schrodi, Thomas Brox, Frank Hutter (05 Feb 2024)
- On Catastrophic Inheritance of Large Foundation Models — Hao Chen, Bhiksha Raj, Xing Xie, Jindong Wang [AI4CE] (02 Feb 2024)
- Anchor function: a type of benchmark functions for studying language models — Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, E. Weinan, Z. Xu (16 Jan 2024)
- The Transient Nature of Emergent In-Context Learning in Transformers — Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill (14 Nov 2023)
- Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems — David T. Hoffmann, Simon Schrodi, Jelena Bratulić, Nadine Behrmann, Volker Fischer, Thomas Brox (19 Oct 2023)
- Breaking through the learning plateaus of in-context learning in Transformer — Jingwen Fu, Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng (12 Sep 2023)
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small — Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt (01 Nov 2022)