Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)
Lena Strobl, Dana Angluin, Robert Frank
arXiv:2503.22076, 28 March 2025
Papers citing "Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)" (6 papers shown)
Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
Alexander Kozachinskiy, Felipe Urrutia, Hector Jimenez, Tomasz Steifer, Germán Pizarro, Matías Fuentes, Francisco Meza, Cristian Buc, Cristóbal Rojas
31 Jan 2025
Simulating Hard Attention Using Soft Attention
Andy Yang, Lena Strobl, David Chiang, Dana Angluin
13 Dec 2024
One-layer transformers fail to solve the induction heads task
Clayton Sanford, Daniel J. Hsu, Matus Telgarsky
26 Aug 2024
On Limitations of the Transformer Architecture
Binghui Peng, Srini Narayanan, Christos H. Papadimitriou
13 Feb 2024
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka, Issei Sato
26 Jul 2023
Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
20 Dec 2019