Length Generalization in Arithmetic Transformers

27 June 2023
Samy Jelassi
Stéphane d'Ascoli
Carles Domingo-Enrich
Yuhuai Wu
Yuanzhi Li
François Charton

Papers citing "Length Generalization in Arithmetic Transformers"

13 / 13 papers shown
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Tianqi Du
Haotian Huang
Yifei Wang
Yisen Wang
21
0
0
13 Jun 2025
Born a Transformer -- Always a Transformer?
Yana Veitsman
Mayank Jobanputra
Yash Sarrof
Aleksandra Bakalova
Vera Demberg
Ellie Pavlick
Michael Hahn
61
0
0
27 May 2025
Graph neural networks extrapolate out-of-distribution for shortest paths
Robert Nerem
Samantha Chen
Sanjoy Dasgupta
Yusu Wang
100
1
0
24 Mar 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM, VLM, LRM, AI4CE
169
7
0
03 Feb 2025
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
235
14
0
03 Jan 2025
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang
Issei Sato
185
0
0
16 Dec 2024
Quantifying artificial intelligence through algorithmic generalization
Takuya Ito
Murray Campbell
L. Horesh
Tim Klinger
Parikshit Ram
ELM
124
0
0
08 Nov 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
113
5
0
24 Oct 2024
Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
Yang Chen
Long Yang
Yitao Liang
Zhouchen Lin
119
1
0
11 Oct 2024
Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks
Xingcheng Xu
Zibo Zhao
Haipeng Zhang
Yanqing Yang
LRM
91
0
0
25 Jul 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
122
43
0
14 Feb 2024
Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Samira Abnar
Omid Saremi
Laurent Dinh
Shantel Wilson
Miguel Angel Bautista
...
Vimal Thilak
Etai Littwin
Jiatao Gu
Josh Susskind
Samy Bengio
108
6
0
13 Oct 2023
GPT Can Solve Mathematical Problems Without a Calculator
Zhiyong Yang
Ming Ding
Qingsong Lv
Zhihuan Jiang
Zehai He
Yuyi Guo
Jinfeng Bai
Jie Tang
RALM, LRM
114
56
0
06 Sep 2023