In-Context Convergence of Transformers
Yu Huang, Yuan Cheng, Yingbin Liang
arXiv: 2310.05249 · 8 October 2023 · [MLT]

Papers citing "In-Context Convergence of Transformers" (18 papers):

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Yixiao Huang, Hanlin Zhu, Tianyu Guo, Jiantao Jiao, Somayeh Sojoudi, Michael I. Jordan, Stuart Russell, Song Mei (12 Jun 2025) [LRM]

Federated In-Context Learning: Iterative Refinement for Improved Answer Quality
Ruhan Wang, Zhiyong Wang, Chengkai Huang, Rui Wang, Tong Yu, Lina Yao, John C. S. Lui, Dongruo Zhou (09 Jun 2025)

Minimalist Softmax Attention Provably Learns Constrained Boolean Functions
Jerry Yao-Chieh Hu, Xiwen Zhang, Maojiang Su, Zhao Song, Han Liu (26 May 2025) [MLT]

When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li, Yihua Zhang, Shuai Zhang, Ming Wang, Sijia Liu, Pin-Yu Chen (15 Apr 2025) [MoMe]

On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li, Chenyang Zhang, Xingwu Chen, Yuan Cao, Difan Zou (24 Feb 2025)

Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
Zixuan Gong, Xiaolin Hu, Huayi Tang, Yong Liu (24 Feb 2025)

Transformers versus the EM Algorithm in Multi-class Clustering
Yihan He, Hong-Yu Chen, Yuan Cao, Jianqing Fan, Han Liu (09 Feb 2025)

Training Dynamics of In-Context Learning in Linear Attention
Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe (27 Jan 2025) [MLT]

Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang, Issei Sato (16 Dec 2024)

On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu, Ruida Zhou, Cong Shen, Jing Yang (17 Oct 2024)

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures
Wei Shen, Ruida Zhou, Jing Yang, Cong Shen (15 Oct 2024)

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song (15 Oct 2024)

Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Hongkang Li, Songtao Lu, Pin-Yu Chen, Xiaodong Cui, Meng Wang (03 Oct 2024) [LRM]

Transformers Handle Endogeneity in In-Context Linear Regression
Haodong Liang, Krishnakumar Balasubramanian, Lifeng Lai (02 Oct 2024)

Spin glass model of in-context learning
Yuhao Li, Ruoran Bai, Haiping Huang (05 Aug 2024) [LRM]

Superiority of Multi-Head Attention in In-Context Linear Regression
Yingqian Cui, Jie Ren, Pengfei He, Jiliang Tang, Yue Xing (30 Jan 2024)

An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy (28 Jan 2024)

Transformers are Provably Optimal In-context Estimators for Wireless Communications
Vishnu Teja Kunde, Vicram Rajagopalan, Chandra Shekhara Kaushik Valmeekam, Krishna R. Narayanan, S. Shakkottai, D. Kalathil, J. Chamberland (01 Nov 2023)