Neural Networks and the Chomsky Hierarchy
arXiv:2207.02098 · 5 July 2022
Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, L. Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega

Papers citing "Neural Networks and the Chomsky Hierarchy" (30 papers)

Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning
Sergio Hernández-Gutiérrez, Minttu Alakuijala, Alexander Nikitin, Pekka Marttinen
05 May 2025

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang, Yingbin Liang, Jing Yang
02 May 2025

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin
27 Feb 2025

Distributional Scaling Laws for Emergent Capabilities
Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra
24 Feb 2025

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien N. Siems, Jörg Franke, Arber Zela, Frank Hutter, Massimiliano Pontil
19 Nov 2024

Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell
11 Nov 2024

MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno, R. Lotufo, Rodrigo Nogueira
08 Oct 2024

Can Transformers Learn n-gram Language Models?
Anej Svete, Nadav Borenstein, M. Zhou, Isabelle Augenstein, Ryan Cotterell
03 Oct 2024

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu, Issei Sato
02 Oct 2024

Revisiting Random Walks for Learning on Graphs
Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade, Youngmin Ryou, Seunghoon Hong
01 Jul 2024

Separations in the Representational Capabilities of Transformers and Recurrent Architectures
S. Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade
13 Jun 2024

A Tensor Decomposition Perspective on Second-order RNNs
M. Lizaire, Michael Rizvi-Martel, Marawan Gamal Abdel Hameed, Guillaume Rabusseau
07 Jun 2024

What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
Nadav Borenstein, Anej Svete, R. Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell
06 Jun 2024

The CLRS-Text Algorithmic Reasoning Language Benchmark
Larisa Markeeva, Sean McLeish, Borja Ibarz, Wilfried Bounsi, Olga Kozlova, Alex Vitvitskyi, Charles Blundell, Tom Goldstein, Avi Schwarzschild, Petar Veličković
06 Jun 2024

Transformers as Transducers
Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, Ashish Sabharwal
02 Apr 2024

Neural Redshift: Random Networks are not Random Functions
Damien Teney, A. Nicolicioiu, Valentin Hartmann, Ehsan Abbasnejad
04 Mar 2024

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea
01 Feb 2024

Learning Universal Predictors
Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Grégoire Delétang, ..., Anian Ruoss, Wenliang Kevin Li, Christopher Mattern, Matthew Aitchison, J. Veness
26 Jan 2024

Recurrent Neural Language Models as Probabilistic Finite-state Automata
Anej Svete, Ryan Cotterell
08 Oct 2023

Language Modeling Is Compression
Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, ..., Wenliang Kevin Li, Matthew Aitchison, Laurent Orseau, Marcus Hutter, J. Veness
19 Sep 2023

Curricular Transfer Learning for Sentence Encoded Tasks
Jader Martins Camboim de Sá, Matheus Ferraroni Sanches, R. R. Souza, Júlio Cesar dos Reis, Leandro A. Villas
03 Aug 2023

Mini-Giants: "Small" Language Models and Open Source Win-Win
Zhengping Zhou, Lezhi Li, Xinxi Chen, Andy Li
17 Jul 2023

Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu, Yuanzhi Li
23 May 2023

Empirical Analysis of the Inductive Bias of Recurrent Neural Networks by Discrete Fourier Transform of Output Sequences
Taiga Ishii, Ryo Ueda, Yusuke Miyao
16 May 2023

General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch, James Harrison, Jascha Narain Sohl-Dickstein, Luke Metz
08 Dec 2022

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
S. Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
22 Nov 2022

Transformers Learn Shortcuts to Automata
Bingbin Liu, Jordan T. Ash, Surbhi Goel, A. Krishnamurthy, Cyril Zhang
19 Oct 2022

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Yuxuan Li, James L. McClelland
02 Oct 2022

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020