arXiv: 2311.00208
What Formal Languages Can Transformers Express? A Survey
1 November 2023
Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin
Papers citing "What Formal Languages Can Transformers Express? A Survey" (23 papers shown):
- Two Heads Are Better than One: Simulating Large Transformers with Small Ones. Hantao Yu, Josh Alman. 13 Jun 2025.
- Sequential-Parallel Duality in Prefix Scannable Models. Morris Yau, Sharut Gupta, Valerie Engelmayer, Kazuki Irie, Stefanie Jegelka, Jacob Andreas. 12 Jun 2025.
- Characterizing the Expressivity of Transformer Language Models. Jiaoda Li, Ryan Cotterell. 29 May 2025.
- Born a Transformer -- Always a Transformer? Yana Veitsman, Mayank Jobanputra, Yash Sarrof, Aleksandra Bakalova, Vera Demberg, Ellie Pavlick, Michael Hahn. 27 May 2025.
- Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers. Charles London, Varun Kanade. 27 May 2025.
- Exact Expressive Power of Transformers with Padding. William Merrill, Ashish Sabharwal. 25 May 2025.
- Mechanistic evaluation of Transformers and state space models. Aryaman Arora, Neil Rathi, Nikil Roashan Selvam, Róbert Csordás, Dan Jurafsky, Christopher Potts. 21 May 2025.
- NoPE: The Counting Power of Transformers with No Positional Encodings. Chris Köcher, Alexander Kozachinskiy, Anthony Widjaja Lin, Marco Sälzer, Georg Zetzsche. 16 May 2025.
- Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models. Hector Pasten, Felipe Urrutia, Hector Jimenez, Cristian B. Calderon, Cristóbal Rojas, Alexander Kozachinskiy. 15 May 2025.
- Lost in Transmission: When and Why LLMs Fail to Reason Globally. Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville. 13 May 2025.
- Exploring Compositional Generalization (in ReCOGS_pos) by Transformers using Restricted Access Sequence Processing (RASP). William Bruns. 21 Apr 2025.
- Meta-Learning Neural Mechanisms rather than Bayesian Priors. Michael Goodale, Salvador Mascarenhas, Yair Lakretz. 20 Mar 2025.
- Unique Hard Attention: A Tale of Two Sides. Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell. 18 Mar 2025.
- Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More. Arvid Frydenlund. 13 Mar 2025.
- CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation. Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, Yulan He. 28 Feb 2025.
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen. 26 Feb 2025.
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity. Yuri Kuratov, M. Arkhipov, Aydar Bulatov, Andrey Kravchenko. 18 Feb 2025.
- Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues. Riccardo Grazzi, Julien N. Siems, Jörg Franke, Arber Zela, Frank Hutter, Massimiliano Pontil. 19 Nov 2024.
- Training Neural Networks as Recognizers of Formal Languages. Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell. 11 Nov 2024.
- Mixture of Parrots: Experts improve memorization more than reasoning. Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham Kakade, Eran Malach. 24 Oct 2024.
- Fundamental Limitations on Subquadratic Alternatives to Transformers. Josh Alman, Hantao Yu. 05 Oct 2024.
- Representing Rule-based Chatbots with Transformers. Dan Friedman, Abhishek Panigrahi, Danqi Chen. 15 Jul 2024.
- The Illusion of State in State-Space Models. William Merrill, Jackson Petty, Ashish Sabharwal. 12 Apr 2024.