ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.03874
  4. Cited By
Measuring Mathematical Problem Solving With the MATH Dataset

Measuring Mathematical Problem Solving With the MATH Dataset

5 March 2021
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
D. Song
Jacob Steinhardt
    ReLM
    FaML
ArXivPDFHTML

Papers citing "Measuring Mathematical Problem Solving With the MATH Dataset"

50 / 1,408 papers shown
Title
Plansformer: Generating Symbolic Plans using Transformers
Plansformer: Generating Symbolic Plans using Transformers
Vishal Pallagani
Bharath Muppasani
K. Murugesan
F. Rossi
L. Horesh
Biplav Srivastava
F. Fabiano
Andrea Loreggia
LM&Ro
LLMAG
OffRL
17
35
0
16 Dec 2022
ALERT: Adapting Language Models to Reasoning Tasks
ALERT: Adapting Language Models to Reasoning Tasks
Ping Yu
Tianlu Wang
O. Yu. Golovneva
Badr AlKhamissi
Siddharth Verma
Zhijing Jin
Gargi Ghosh
Mona T. Diab
Asli Celikyilmaz
ReLM
LRM
34
21
0
16 Dec 2022
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
O. Yu. Golovneva
Moya Chen
Spencer Poff
Martin Corredor
Luke Zettlemoyer
Maryam Fazel-Zarandi
Asli Celikyilmaz
ReLM
LRM
22
138
0
15 Dec 2022
Despite "super-human" performance, current LLMs are unsuited for
  decisions about ethics and safety
Despite "super-human" performance, current LLMs are unsuited for decisions about ethics and safety
Joshua Albrecht
Ellie Kitanidis
Abraham J. Fetterman
ELM
ReLM
ALM
LRM
27
17
0
13 Dec 2022
Solving math word problems with process- and outcome-based feedback
Solving math word problems with process- and outcome-based feedback
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML
ReLM
AIMat
LRM
33
283
0
25 Nov 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning
  for Numerical Reasoning Tasks
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen
Xueguang Ma
Xinyi Wang
William W. Cohen
ReLM
ReCod
LRM
66
732
0
22 Nov 2022
PAL: Program-aided Language Models
PAL: Program-aided Language Models
Luyu Gao
Aman Madaan
Shuyan Zhou
Uri Alon
Pengfei Liu
Yiming Yang
Jamie Callan
Graham Neubig
ReLM
LRM
29
413
0
18 Nov 2022
Galactica: A Large Language Model for Science
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
32
727
0
16 Nov 2022
Teaching Algorithmic Reasoning via In-context Learning
Teaching Algorithmic Reasoning via In-context Learning
Hattie Zhou
Azade Nova
Hugo Larochelle
Aaron C. Courville
Behnam Neyshabur
Hanie Sedghi
LRM
ReLM
30
108
0
15 Nov 2022
Logical Tasks for Measuring Extrapolation and Rule Comprehension
Logical Tasks for Measuring Extrapolation and Rule Comprehension
Ippei Fujisawa
Ryota Kanai
ELM
LRM
28
4
0
14 Nov 2022
Development of a Neural Network-Based Mathematical Operation Protocol
  for Embedded Hexadecimal Digits Using Neural Architecture Search (NAS)
Development of a Neural Network-Based Mathematical Operation Protocol for Embedded Hexadecimal Digits Using Neural Architecture Search (NAS)
Victor Robila
Kexin Pei
Junfeng Yang
16
0
0
12 Nov 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
12
7
0
31 Oct 2022
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal
  Proofs
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
70
158
0
21 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning
  with Language Models
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM
OOD
LRM
35
61
0
21 Oct 2022
Prompting GPT-3 To Be Reliable
Prompting GPT-3 To Be Reliable
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM
LRM
50
279
0
17 Oct 2022
Can Language Representation Models Think in Bets?
Can Language Representation Models Think in Bets?
Zhi–Bin Tang
Mayank Kejriwal
15
6
0
14 Oct 2022
Learning to Reason With Relational Abstractions
Learning to Reason With Relational Abstractions
A. Nam
Mengye Ren
Chelsea Finn
James L. McClelland
ReLM
LRM
37
4
0
06 Oct 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human
  Moral Judgment
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Rada Mihalcea
J. Tenenbaum
Bernhard Schölkopf
ELM
LRM
28
90
0
04 Oct 2022
Augmenting Operations Research with Auto-Formulation of Optimization
  Models from Problem Descriptions
Augmenting Operations Research with Auto-Formulation of Optimization Models from Problem Descriptions
Rindranirina Ramamonjison
Haley Li
Timothy T. Yu
Shiqi He
Vishnu Rengan
Amin Banitalebi-Dehkordi
Zirui Zhou
Yong Zhang
40
31
0
30 Sep 2022
Limits of an AI program for solving college math problems
Limits of an AI program for solving college math problems
E. Davis
AIMat
17
3
0
14 Aug 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying Chen
Xinyan Xiao
Jing Liu
Hua-Hong Wu
37
4
0
28 Jul 2022
Exploring Length Generalization in Large Language Models
Exploring Length Generalization in Large Language Models
Cem Anil
Yuhuai Wu
Anders Andreassen
Aitor Lewkowycz
Vedant Misra
V. Ramasesh
Ambrose Slone
Guy Gur-Ari
Ethan Dyer
Behnam Neyshabur
ReLM
LRM
33
158
0
11 Jul 2022
Machine Learning Model Sizes and the Parameter Gap
Machine Learning Model Sizes and the Parameter Gap
Pablo Villalobos
J. Sevilla
T. Besiroglu
Lennart Heim
A. Ho
Marius Hobbhahn
ALM
ELM
AI4CE
27
57
0
05 Jul 2022
Forecasting Future World Events with Neural Networks
Forecasting Future World Events with Neural Networks
Andy Zou
Tristan Xiao
Ryan Jia
Joe Kwon
Mantas Mazeika
Richard Li
Dawn Song
Jacob Steinhardt
Owain Evans
Dan Hendrycks
24
22
0
30 Jun 2022
Solving Quantitative Reasoning Problems with Language Models
Solving Quantitative Reasoning Problems with Language Models
Aitor Lewkowycz
Anders Andreassen
David Dohan
Ethan Dyer
Henryk Michalewski
...
Theo Gutman-Solo
Yuhuai Wu
Behnam Neyshabur
Guy Gur-Ari
Vedant Misra
ReLM
ELM
LRM
58
739
0
29 Jun 2022
Bridging the Gap Between Indexing and Retrieval for Differentiable
  Search Index with Query Generation
Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation
Shengyao Zhuang
Houxing Ren
Linjun Shou
Jian Pei
Ming Gong
Guido Zuccon
Daxin Jiang
37
64
0
21 Jun 2022
Unveiling Transformers with LEGO: a synthetic reasoning task
Unveiling Transformers with LEGO: a synthetic reasoning task
Yi Zhang
A. Backurs
Sébastien Bubeck
Ronen Eldan
Suriya Gunasekar
Tal Wagner
LRM
28
85
0
09 Jun 2022
MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and
  Textual Data
MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data
Yilun Zhao
Yunxiang Li
Chenying Li
Rui Zhang
AIMat
37
97
0
03 Jun 2022
A Survey in Mathematical Language Processing
A Survey in Mathematical Language Processing
Jordan Meadows
André Freitas
AIMat
19
15
0
30 May 2022
NaturalProver: Grounded Mathematical Proof Generation with Language
  Models
NaturalProver: Grounded Mathematical Proof Generation with Language Models
Sean Welleck
Jiacheng Liu
Ximing Lu
Hannaneh Hajishirzi
Yejin Choi
AIMat
LRM
22
65
0
25 May 2022
Autoformalization with Large Language Models
Autoformalization with Large Language Models
Yuhuai Wu
Albert Q. Jiang
Wenda Li
M. Rabe
Charles Staats
M. Jamnik
Christian Szegedy
AI4CE
110
156
0
25 May 2022
TALM: Tool Augmented Language Models
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM
RALM
LLMAG
29
144
0
24 May 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
72
801
0
14 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning
  Tasks
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
A. Kalyan
AIMat
ReLM
ELM
LRM
25
102
0
12 Apr 2022
Capturing Failures of Large Language Models via Human Cognitive Biases
Capturing Failures of Large Language Models via Human Cognitive Biases
Erik Jones
Jacob Steinhardt
28
88
0
24 Feb 2022
GPT-based Open-Ended Knowledge Tracing
GPT-based Open-Ended Knowledge Tracing
Naiming Liu
Zichao Wang
Richard G. Baraniuk
Andrew S. Lan
AI4Ed
32
3
0
21 Feb 2022
Deconstructing Distributions: A Pointwise Framework of Learning
Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun
Nikhil Ghosh
Saurabh Garg
Boaz Barak
Preetum Nakkiran
OOD
25
21
0
20 Feb 2022
Formal Mathematics Statement Curriculum Learning
Formal Mathematics Statement Curriculum Learning
Stanislas Polu
Jesse Michael Han
Kunhao Zheng
Mantas Baksys
Igor Babuschkin
Ilya Sutskever
AIMat
84
116
0
03 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
367
8,495
0
28 Jan 2022
Towards More Robust Natural Language Understanding
Towards More Robust Natural Language Understanding
Xinliang Frederick Zhang
22
2
0
01 Dec 2021
Solving Probability and Statistics Problems by Program Synthesis
Solving Probability and Statistics Problems by Program Synthesis
Leonard Tang
Elizabeth Ke
Nikhil Singh
Nakul Verma
Iddo Drori
11
15
0
16 Nov 2021
Towards Tractable Mathematical Reasoning: Challenges, Strategies, and
  Opportunities for Solving Math Word Problems
Towards Tractable Mathematical Reasoning: Challenges, Strategies, and Opportunities for Solving Math Word Problems
Keyur Faldu
A. Sheth
Prashant Kikani
Manas Gaur
Aditi Avasthi
LRM
29
17
0
29 Oct 2021
Training Verifiers to Solve Math Word Problems
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
29
3,785
0
27 Oct 2021
Pretrained Language Models are Symbolic Mathematics Solvers too!
Pretrained Language Models are Symbolic Mathematics Solvers too!
Kimia Noorbakhsh
Modar Sulaiman
M. Sharifi
Kallol Roy
Pooyan Jamshidi
LRM
22
18
0
07 Oct 2021
TruthfulQA: Measuring How Models Mimic Human Falsehoods
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Stephanie C. Lin
Jacob Hilton
Owain Evans
HILM
34
1,735
0
08 Sep 2021
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
26
22
0
05 Sep 2021
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
AIMat
27
151
0
31 Aug 2021
Systematic human learning and generalization from a brief tutorial with
  explanatory feedback
Systematic human learning and generalization from a brief tutorial with explanatory feedback
A. Nam
James L. McClelland
16
1
0
10 Jul 2021
Solving Machine Learning Problems
Solving Machine Learning Problems
Sunny Tran
P. Krishna
Ishan Pakuwal
Prabhakar Kafle
Nikhil Singh
J. Lynch
Iddo Drori
VLM
13
11
0
02 Jul 2021
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and
  Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Mehdi Cherti
J. Jitsev
LM&MA
22
23
0
31 May 2021
Previous
123...272829
Next