Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)

28 March 2025
Lena Strobl, Dana Angluin, Robert Frank
arXiv:2503.22076

Papers citing "Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)"

6 papers shown

Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
Alexander Kozachinskiy, Felipe Urrutia, Hector Jimenez, Tomasz Steifer, Germán Pizarro, Matías Fuentes, Francisco Meza, Cristian Buc, Cristóbal Rojas
31 Jan 2025

Simulating Hard Attention Using Soft Attention
Andy Yang, Lena Strobl, David Chiang, Dana Angluin
13 Dec 2024

One-layer transformers fail to solve the induction heads task
Clayton Sanford, Daniel J. Hsu, Matus Telgarsky
26 Aug 2024

On Limitations of the Transformer Architecture
Binghui Peng, Srini Narayanan, Christos H. Papadimitriou
13 Feb 2024

Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka, Issei Sato
26 Jul 2023

Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
20 Dec 2019