Transformers Learn Shortcuts to Automata
Bingbin Liu, Jordan T. Ash, Surbhi Goel, A. Krishnamurthy, Cyril Zhang
arXiv:2210.10749, 19 October 2022 [OffRL, LRM]

Papers citing "Transformers Learn Shortcuts to Automata"
50 / 136 papers shown
Lost in Transmission: When and Why LLMs Fail to Reason Globally
Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville (13 May 2025) [LRM]
Partial Answer of How Transformers Learn Automata
Tiantian (29 Apr 2025)
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan (21 Apr 2025)
Provable Failure of Language Models in Learning Majority Boolean Logic via Gradient Descent
Bo Chen, Zhenmei Shi, Zhao-quan Song, Jiahao Zhang (07 Apr 2025) [NAI, LRM, AI4CE]
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper, Roland Fernandez, P. Smolensky, Jianfeng Gao (29 Mar 2025)
Graph neural networks extrapolate out-of-distribution for shortest paths
Robert Nerem, Samantha Chen, Sanjoy Dasgupta, Yusu Wang (24 Mar 2025)
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
William Merrill, Ashish Sabharwal (05 Mar 2025)
(How) Do Language Models Track State?
Belinda Z. Li, Zifan Carl Guo, Jacob Andreas (04 Mar 2025) [LRM]
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Gilad Yehudai, Noah Amsel, Joan Bruna (03 Mar 2025) [LRM]
Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers
Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson (03 Mar 2025)
Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization
Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Yusuke Iwasawa, Y. Matsuo, Jiaxian Guo (25 Feb 2025) [OODD, LRM]
Ask, and it shall be given: On the Turing completeness of prompting
Ruizhong Qiu, Zhe Xu, W. Bao, Hanghang Tong (24 Feb 2025) [ReLM, LRM, AI4CE]
Reasoning with Latent Thoughts: On the Power of Looped Transformers
Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi (24 Feb 2025) [OffRL, LRM, AI4CE]
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich, Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach (24 Feb 2025)
On Computational Limits of FlowAR Models: Expressivity and Efficiency
Chengyue Gong, Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao-quan Song (23 Feb 2025)
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao-quan Song, Yufa Zhou (21 Feb 2025)
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal, Haruki Shirakami, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan (17 Feb 2025) [LRM]
Transformers versus the EM Algorithm in Multi-class Clustering
Yihan He, Hong-Yu Chen, Yuan Cao, Jianqing Fan, Han Liu (09 Feb 2025)
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn (04 Feb 2025) [LRM]
Circuit Complexity Bounds for Visual Autoregressive Model
Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song (08 Jan 2025)
ICLR: In-Context Learning of Representations
Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka (29 Dec 2024) [AIFin]
Theoretical Constraints on the Expressive Power of RoPE-based Tensor Attention Transformers
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song, Mingda Wan (23 Dec 2024)
Theoretical limitations of multi-layer Transformer
Lijie Chen, Binghui Peng, Hongxun Wu (04 Dec 2024) [AI4CE]
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien N. Siems, Jörg K.H. Franke, Arber Zela, Frank Hutter, Massimiliano Pontil (19 Nov 2024)
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell (11 Nov 2024) [NAI]
Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence
İlker Işık, R. G. Cinbis, Ebru Aydin Gol (22 Oct 2024)
Learning Linear Attention in Polynomial Time
Morris Yau, Ekin Akyürek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas (14 Oct 2024)
Can Transformers Reason Logically? A Study in SAT Solving
Leyan Pan, Vijay Ganesh, Jacob Abernethy, Chris Esposo, Wenke Lee (09 Oct 2024) [ReLM, LRM]
Extracting Finite State Machines from Transformers
Rik Adriaensen, Jaron Maene (08 Oct 2024) [AI4CE]
Fundamental Limitations on Subquadratic Alternatives to Transformers
Josh Alman, Hantao Yu (05 Oct 2024)
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
Yuandong Tian (02 Oct 2024)
Positional Attention: Expressivity and Learnability of Algorithmic Computation
Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, K. Fountoulakis (02 Oct 2024)
Transformers in Uniform TC^0
David Chiang (20 Sep 2024)
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, Greg Durrett (18 Sep 2024) [ReLM, LRM]
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles
Kulin Shah, Nishanth Dikkala, Xin Wang, Rina Panigrahy (16 Sep 2024) [ELM, ReLM, LRM]
Transformers As Approximations of Solomonoff Induction
Nathan Young, Michael Witbrock (22 Aug 2024)
Learning Randomized Algorithms with Transformers
J. Oswald, Seijin Kobayashi, Yassir Akram, Angelika Steger (20 Aug 2024) [AAML]
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
Xingwu Chen, Lei Zhao, Difan Zou (08 Aug 2024)
LLMs as Probabilistic Minimally Adequate Teachers for DFA Learning
Lekai Chen, Ashutosh Trivedi, Alvaro Velasquez (06 Aug 2024)
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Yuchen Li, Alexandre Kirchmeyer, Aashay Mehta, Yilong Qin, Boris Dadachev, Kishore Papineni, Sanjiv Kumar, Andrej Risteski (22 Jul 2024)
Mechanistically Interpreting a Transformer-based 2-SAT Solver: An Axiomatic Approach
Nils Palumbo, Ravi Mangal, Zifan Wang, Saranya Vijayakumar, Corina S. Pasareanu, Somesh Jha (18 Jul 2024)
Representing Rule-based Chatbots with Transformers
Dan Friedman, Abhishek Panigrahi, Danqi Chen (15 Jul 2024)
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
Yingcong Li, A. S. Rawat, Samet Oymak (13 Jul 2024)
Algorithmic Language Models with Neurally Compiled Libraries
Lucas Saldyt, Subbarao Kambhampati (06 Jul 2024) [LRM]
Universal Length Generalization with Turing Programs
Kaiying Hou, David Brandfonbrener, Sham Kakade, Samy Jelassi, Eran Malach (03 Jul 2024)
Transformer Normalisation Layers and the Independence of Semantic Subspaces
S. Menary, Samuel Kaski, Andre Freitas (25 Jun 2024)
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong (21 Jun 2024)
Separations in the Representational Capabilities of Transformers and Recurrent Architectures
S. Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade (13 Jun 2024) [GNN]
On Limitation of Transformer for Learning HMMs
Jiachen Hu, Qinghua Liu, Chi Jin (06 Jun 2024)
Evaluating the World Model Implicit in a Generative Model
Keyon Vafa, Justin Y. Chen, Jon M. Kleinberg, S. Mullainathan, Ashesh Rambachan (06 Jun 2024)