arXiv:2505.15009
Cited By
One-Layer Transformers are Provably Optimal for In-context Reasoning and Distributional Association Learning in Next-Token Prediction Tasks
21 May 2025
Quan Nguyen
Thanh Nguyen-Tang
MLT
Papers citing "One-Layer Transformers are Provably Optimal for In-context Reasoning and Distributional Association Learning in Next-Token Prediction Tasks"
1 / 1 papers shown
Title
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
117
21
0
08 Feb 2024