Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers

5 April 2024
Andy Yang, David Chiang
arXiv:2404.04393

Papers citing "Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers"

9 papers

Tracr-Injection: Distilling Algorithms into Pre-trained Language Models
Tomás Vergara-Browne, Álvaro Soto
15 May 2025

Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen
AI4CE
26 Feb 2025

Ehrenfeucht-Haussler Rank and Chain of Thought
Pablo Barceló, Alexander Kozachinskiy, Tomasz Steifer
LRM
22 Jan 2025

Transformers in Uniform TC$^0$
David Chiang
20 Sep 2024

Language Models Need Inductive Biases to Count Inductively
Yingshan Chang, Yonatan Bisk
LRM
30 May 2024

The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof, Yana Veitsman, Michael Hahn
Mamba
27 May 2024

Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages
Andy Yang, David Chiang, Dana Angluin
21 Oct 2023

On the Expressivity Role of LayerNorm in Transformers' Attention
Shaked Brody, Shiyu Jin, Xinghao Zhu
MoE
04 May 2023

A Logic for Expressing Log-Precision Transformers
William Merrill, Ashish Sabharwal
ReLM, NAI, LRM
06 Oct 2022