2405.17399
Cited By
Transformers Can Do Arithmetic with the Right Embeddings
27 May 2024
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
Brian Bartoldson
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
Papers citing
"Transformers Can Do Arithmetic with the Right Embeddings"
11 / 11 papers shown
Title
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
Haoyang Li
Xuejia Chen
Zhanchao Xu
Darian Li
Nicole Hu
...
Heng Chang
Luyu Qiu
C. Zhang
Qing Li
Lei Chen
LRM
ELM
40
1
0
16 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
180
0
0
04 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
80
4
0
03 Feb 2025
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
30
4
0
24 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
145
0
0
07 Oct 2024
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang
Hanlin Zhang
Xiner Li
Kuan-Hao Huang
Chi Han
Shuiwang Ji
Sham Kakade
Hao Peng
Heng Ji
57
12
0
01 Jul 2024
Large Language Models as Surrogate Models in Evolutionary Algorithms: A Preliminary Study
Hao Hao
Xiaoqun Zhang
Aimin Zhou
ELM
38
9
0
15 Jun 2024
Discrete Neural Algorithmic Reasoning
Gleb Rodionov
Liudmila Prokhorenkova
OOD
NAI
42
3
0
18 Feb 2024
Simulation of Graph Algorithms with Looped Transformers
Artur Back de Luca
K. Fountoulakis
55
14
0
02 Feb 2024
The CLRS Algorithmic Reasoning Benchmark
Petar Veličković
Adria Puigdomenech Badia
David Budden
Razvan Pascanu
Andrea Banino
Mikhail Dashevskiy
R. Hadsell
Charles Blundell
161
88
0
31 May 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
250
695
0
27 Aug 2021