What Formal Languages Can Transformers Express? A Survey
Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin
1 November 2023 · arXiv:2311.00208 (AI4CE)
Papers citing "What Formal Languages Can Transformers Express? A Survey" (23 papers)
1. Two Heads Are Better than One: Simulating Large Transformers with Small Ones. Hantao Yu, Josh Alman. 13 Jun 2025.
2. Sequential-Parallel Duality in Prefix Scannable Models. Morris Yau, Sharut Gupta, Valerie Engelmayer, Kazuki Irie, Stefanie Jegelka, Jacob Andreas. 12 Jun 2025.
3. Characterizing the Expressivity of Transformer Language Models. Jiaoda Li, Ryan Cotterell. 29 May 2025.
4. Born a Transformer -- Always a Transformer? Yana Veitsman, Mayank Jobanputra, Yash Sarrof, Aleksandra Bakalova, Vera Demberg, Ellie Pavlick, Michael Hahn. 27 May 2025.
5. Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers. Charles London, Varun Kanade. 27 May 2025.
6. Exact Expressive Power of Transformers with Padding. William Merrill, Ashish Sabharwal. 25 May 2025.
7. Mechanistic evaluation of Transformers and state space models. Aryaman Arora, Neil Rathi, Nikil Roashan Selvam, Róbert Csordás, Dan Jurafsky, Christopher Potts. 21 May 2025.
8. NoPE: The Counting Power of Transformers with No Positional Encodings. Chris Köcher, Alexander Kozachinskiy, Anthony Widjaja Lin, Marco Sälzer, Georg Zetzsche. 16 May 2025.
9. Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models. Hector Pasten, Felipe Urrutia, Hector Jimenez, Cristian B. Calderon, Cristóbal Rojas, Alexander Kozachinskiy. 15 May 2025.
10. Lost in Transmission: When and Why LLMs Fail to Reason Globally. Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville. 13 May 2025. (LRM)
11. Exploring Compositional Generalization (in ReCOGS_pos) by Transformers using Restricted Access Sequence Processing (RASP). William Bruns. 21 Apr 2025.
12. Meta-Learning Neural Mechanisms rather than Bayesian Priors. Michael Goodale, Salvador Mascarenhas, Yair Lakretz. 20 Mar 2025.
13. Unique Hard Attention: A Tale of Two Sides. Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell. 18 Mar 2025.
14. Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More. Arvid Frydenlund. 13 Mar 2025. (LRM)
15. CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation. Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, Yulan He. 28 Feb 2025. (LRM)
16. Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen. 26 Feb 2025. (AI4CE)
17. Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity. Yuri Kuratov, M. Arkhipov, Aydar Bulatov, Andrey Kravchenko. 18 Feb 2025.
18. Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues. Riccardo Grazzi, Julien N. Siems, Jörg Franke, Arber Zela, Frank Hutter, Massimiliano Pontil. 19 Nov 2024.
19. Training Neural Networks as Recognizers of Formal Languages. Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell. 11 Nov 2024. (NAI)
20. Mixture of Parrots: Experts improve memorization more than reasoning. Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham Kakade, Eran Malach. 24 Oct 2024. (MoE)
21. Fundamental Limitations on Subquadratic Alternatives to Transformers. Josh Alman, Hantao Yu. 05 Oct 2024.
22. Representing Rule-based Chatbots with Transformers. Dan Friedman, Abhishek Panigrahi, Danqi Chen. 15 Jul 2024.
23. The Illusion of State in State-Space Models. William Merrill, Jackson Petty, Ashish Sabharwal. 12 Apr 2024.