Case-Based or Rule-Based: How Do Transformers Do the Math?

27 February 2024
Yi Hu, Xiaojuan Tang, Haotong Yang, Muhan Zhang
LRM

Papers citing "Case-Based or Rule-Based: How Do Transformers Do the Math?"

23 / 23 papers shown

MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal, Haruki Shirakami, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan
LRM · 17 Feb 2025

Number Cookbook: Number Understanding of Language Models and How to Improve It
Haotong Yang, Yi Hu, Shijia Kang, Zhouchen Lin, Muhan Zhang
LRM · 06 Nov 2024

RuleRAG: Rule-Guided Retrieval-Augmented Generation with Language Models for Question Answering
Zhongwu Chen, Chengjin Xu, Dingmin Wang, Zhen Huang, Yong Dou, Xuhui Jiang, Jian Guo
RALM · 15 Oct 2024

LooGLE: Can Long-Context Language Models Understand Long Contexts?
Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang
ELM · RALM · 08 Nov 2023

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Buse Giledereli, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan
LRM · 23 Oct 2023

Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure
Haotong Yang, Fanxu Meng, Zhouchen Lin, Muhan Zhang
LRM · 09 Oct 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
AI4MH · ALM · 18 Jul 2023

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim
LRM · ReLM · 05 Jul 2023

The Impact of Positional Encoding on Length Generalization in Transformers
Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan N. Ramamurthy, Payel Das, Siva Reddy
31 May 2023

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min, Kalpesh Krishna, Xinxi Lyu, M. Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi
HILM · ALM · 23 May 2023

ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models
Ning Bian, Xianpei Han, Le Sun, Hongyu Lin, Yaojie Lu, Shanshan Jiang, Bin Dong
KELM · ELM · AI4MH · LRM · 29 Mar 2023

Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers
Damai Dai, Yutao Sun, Li Dong, Y. Hao, Shuming Ma, Zhifang Sui, Furu Wei
LRM · 20 Dec 2022

Transformers learn in-context by gradient descent
J. Oswald, Eyvind Niklasson, E. Randazzo, João Sacramento, A. Mordvintsev, A. Zhmoginov, Max Vladymyrov
MLT · 15 Dec 2022

What learning algorithm is in-context learning? Investigations with linear models
Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou
28 Nov 2022

Decomposed Prompting: A Modular Approach for Solving Complex Tasks
Tushar Khot, H. Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal
ReLM · LRM · 05 Oct 2022

What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant
01 Aug 2022

Exploring Length Generalization in Large Language Models
Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra, V. Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur
ReLM · LRM · 11 Jul 2022

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, ..., Zhuoye Zhao, Zijian Wang, Zijie J. Wang, Zirui Wang, Ziyi Wu
ELM · 09 Jun 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
Denny Zhou, Nathanael Scharli, Le Hou, Jason W. Wei, Nathan Scales, ..., Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed H. Chi
RALM · LRM · AI4CE · 21 May 2022

PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, ..., Kathy Meier-Hellstern, Douglas Eck, J. Dean, Slav Petrov, Noah Fiedel
PILM · LRM · 05 Apr 2022

LaMDA: Language Models for Dialog Applications
R. Thoppilan, Daniel De Freitas, Jamie Hall, Noam M. Shazeer, Apoorv Kulshreshtha, ..., Blaise Aguera-Arcas, Claire Cui, M. Croak, Ed H. Chi, Quoc Le
ALM · 20 Jan 2022

Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
BDL · 28 May 2020

Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
20 Dec 2019