In-Context Convergence of Transformers
8 October 2023
Yu Huang, Yuan Cheng, Yingbin Liang · MLT
ArXiv 2310.05249: abs · PDF · HTML

Papers citing "In-Context Convergence of Transformers" (18 of 18 papers shown)

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Yixiao Huang, Hanlin Zhu, Tianyu Guo, Jiantao Jiao, Somayeh Sojoudi, Michael I. Jordan, Stuart Russell, Song Mei · LRM · 12 Jun 2025

Federated In-Context Learning: Iterative Refinement for Improved Answer Quality
Ruhan Wang, Zhiyong Wang, Chengkai Huang, Rui Wang, Tong Yu, Lina Yao, John C. S. Lui, Dongruo Zhou · 09 Jun 2025

Minimalist Softmax Attention Provably Learns Constrained Boolean Functions
Jerry Yao-Chieh Hu, Xiwen Zhang, Maojiang Su, Zhao Song, Han Liu · MLT · 26 May 2025

When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li, Yihua Zhang, Shuai Zhang, Ming Wang, Sijia Liu, Pin-Yu Chen · MoMe · 15 Apr 2025

On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li, Chenyang Zhang, Xingwu Chen, Yuan Cao, Difan Zou · 24 Feb 2025

Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
Zixuan Gong, Xiaolin Hu, Huayi Tang, Yong Liu · 24 Feb 2025

Transformers versus the EM Algorithm in Multi-class Clustering
Yihan He, Hong-Yu Chen, Yuan Cao, Jianqing Fan, Han Liu · 09 Feb 2025

Training Dynamics of In-Context Learning in Linear Attention
Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe · MLT · 27 Jan 2025

Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang, Issei Sato · 16 Dec 2024

On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu, Ruida Zhou, Cong Shen, Jing Yang · 17 Oct 2024

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures
Wei Shen, Ruida Zhou, Jing Yang, Cong Shen · 15 Oct 2024

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song · 15 Oct 2024

Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Hongkang Li, Songtao Lu, Pin-Yu Chen, Xiaodong Cui, Meng Wang · LRM · 03 Oct 2024

Transformers Handle Endogeneity in In-Context Linear Regression
Haodong Liang, Krishnakumar Balasubramanian, Lifeng Lai · 02 Oct 2024

Spin glass model of in-context learning
Yuhao Li, Ruoran Bai, Haiping Huang · LRM · 05 Aug 2024

Superiority of Multi-Head Attention in In-Context Linear Regression
Yingqian Cui, Jie Ren, Pengfei He, Jiliang Tang, Yue Xing · 30 Jan 2024

An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy · 28 Jan 2024

Transformers are Provably Optimal In-context Estimators for Wireless Communications
Vishnu Teja Kunde, Vicram Rajagopalan, Chandra Shekhara Kaushik Valmeekam, Krishna R. Narayanan, S. Shakkottai, D. Kalathil, J. Chamberland · 01 Nov 2023