When Can Transformers Count to n?

21 July 2024
Gilad Yehudai, Haim Kaplan, Asma Ghandeharioun, Mor Geva, Amir Globerson
arXiv:2407.15160

Papers citing "When Can Transformers Count to n?"

12 of 12 citing papers shown.

Tracr-Injection: Distilling Algorithms into Pre-trained Language Models
Tomás Vergara-Browne, Álvaro Soto
15 May 2025

StringLLM: Understanding the String Processing Capability of Large Language Models
Xilong Wang, Hao Fu, Jindong Wang, Neil Zhenqiang Gong
28 Jan 2025

More Expressive Attention with Negative Weights
Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, Xingwu Sun, Zhanhui Kang, Di Wang, Rui Yan
11 Nov 2024

LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu, Xuezhe Ma
Communities: LRM
18 Oct 2024

From Introspection to Best Practices: Principled Analysis of Demonstrations in Multimodal In-Context Learning
Nan Xu, Fei Wang, Sheng Zhang, Hoifung Poon, Muhao Chen
01 Jul 2024

On Limitations of the Transformer Architecture
Binghui Peng, Srini Narayanan, Christos H. Papadimitriou
13 Feb 2024

Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Guhao Feng, Bohang Zhang, Yuntian Gu, Haotian Ye, Di He, Liwei Wang
Communities: LRM
24 May 2023

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
24 Sep 2022

Exploring Length Generalization in Large Language Models
Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra, V. Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur
Communities: ReLM, LRM
11 Jul 2022

The Parallelism Tradeoff: Limitations of Log-Precision Transformers
William Merrill, Ashish Sabharwal
02 Jul 2022

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, ..., Zhuoye Zhao, Zijian Wang, Zijie J. Wang, Zirui Wang, Ziyi Wu
Communities: ELM
09 Jun 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
Communities: LM&Ro, LRM, AI4CE, ReLM
28 Jan 2022