Training Verifiers to Solve Math Word Problems


27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
    ReLM
    OffRL
    LRM

Papers citing "Training Verifiers to Solve Math Word Problems"

50 / 3,031 papers shown
FERMAT: An Alternative to Accuracy for Numerical Reasoning
Jasivan Sivakumar
N. Moosavi
ReLM
LRM
43
3
0
27 May 2023
Matrix Information Theory for Self-Supervised Learning
Yifan Zhang
Zhi-Hao Tan
Jingqin Yang
Weiran Huang
Yang Yuan
SSL
48
18
0
27 May 2023
Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
Yao Fu
Litu Ou
Mingyu Chen
Yuhao Wan
Hao-Chun Peng
Tushar Khot
LLMAG
ELM
LRM
ReLM
33
109
0
26 May 2023
Large Language Models as Tool Makers
Tianle Cai
Xuezhi Wang
Tengyu Ma
Xinyun Chen
Denny Zhou
LLMAG
37
186
0
26 May 2023
Learning and Leveraging Verifiers to Improve Planning Capabilities of Pre-trained Language Models
Daman Arora
Subbarao Kambhampati
LRM
23
11
0
26 May 2023
MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting
Tatsuro Inaba
Hirokazu Kiyomaru
Fei Cheng
Sadao Kurohashi
KELM
LRM
26
23
0
26 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
35
69
0
25 May 2023
Voyager: An Open-Ended Embodied Agent with Large Language Models
Guanzhi Wang
Yuqi Xie
Yunfan Jiang
Ajay Mandlekar
Chaowei Xiao
Yuke Zhu
Linxi Fan
Anima Anandkumar
LM&Ro
SyDa
63
757
0
25 May 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
22
518
0
24 May 2023
EvEval: A Comprehensive Evaluation of Event Semantics for Large Language Models
Zhengwei Tao
Zhi Jin
Xiaoying Bai
Haiyan Zhao
Yanlin Feng
Jia Li
Wenpeng Hu
37
4
0
24 May 2023
Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration
Kejuan Yang
Xiao Liu
Kaiwen Men
Aohan Zeng
Yuxiao Dong
Jie Tang
LLMAG
LRM
21
3
0
24 May 2023
Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models
Daman Arora
H. Singh
Mausam
ELM
LRM
33
50
0
24 May 2023
A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis
Alessandro Stolfo
Yonatan Belinkov
Mrinmaya Sachan
MILM
KELM
LRM
35
50
0
24 May 2023
Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems
Marek Kadlčík
Michal Štefánik
Ondřej Sotolář
Vlastimil Martinek
LRM
22
13
0
24 May 2023
Reasoning with Language Model is Planning with World Model
Shibo Hao
Yi Gu
Haodi Ma
Joshua Jiahua Hong
Zhen Wang
D. Wang
Zhiting Hu
ReLM
LRM
LLMAG
68
519
0
24 May 2023
GRACE: Discriminator-Guided Chain-of-Thought Reasoning
Muhammad Khalifa
Lajanugen Logeswaran
Moontae Lee
Ho Hin Lee
Lu Wang
LRM
32
37
0
24 May 2023
Coverage-based Example Selection for In-Context Learning
Shivanshu Gupta
Matt Gardner
Sameer Singh
31
40
0
24 May 2023
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
Sheng Shen
Le Hou
Yan-Quan Zhou
Nan Du
Shayne Longpre
...
Vincent Zhao
Hongkun Yu
Kurt Keutzer
Trevor Darrell
Denny Zhou
ALM
MoE
38
54
0
24 May 2023
Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs
Xiang Zhang
Senyu Li
B. Hauer
Ning Shi
Grzegorz Kondrak
LRM
33
81
0
24 May 2023
Getting MoRE out of Mixture of Language Model Reasoning Experts
Chenglei Si
Weijia Shi
Chen Zhao
Luke Zettlemoyer
Jordan L. Boyd-Graber
LRM
26
24
0
24 May 2023
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Simeng Sun
Y. Liu
Shuohang Wang
Chenguang Zhu
Mohit Iyyer
RALM
LRM
ReLM
33
51
0
23 May 2023
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
22
55
0
23 May 2023
Deduction under Perturbed Evidence: Probing Student Simulation Capabilities of Large Language Models
Shashank Sonkar
Richard G. Baraniuk
24
1
0
23 May 2023
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning
Alexander Scarlatos
Andrew S. Lan
OffRL
LRM
29
20
0
23 May 2023
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
Zhiheng Xi
Senjie Jin
Yuhao Zhou
Rui Zheng
Songyang Gao
Tao Gui
Qi Zhang
Xuanjing Huang
ReLM
LRM
44
44
0
23 May 2023
Language Model Self-improvement by Reinforcement Learning Contemplation
Jing-Cheng Pang
Pengyuan Wang
Kaiyuan Li
Xiong-Hui Chen
Jiacheng Xu
Zongzhang Zhang
Yang Yu
LRM
KELM
15
43
0
23 May 2023
Automatic Model Selection with Large Language Models for Reasoning
Xu Zhao
Yuxi Xie
Kenji Kawaguchi
Junxian He
Qizhe Xie
ReLM
LRM
39
39
0
23 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
49
614
0
23 May 2023
Question Answering as Programming for Solving Time-Sensitive Questions
Xinyu Zhu
Cheng Yang
B. Chen
Siheng Li
Jian-Guang Lou
Yujiu Yang
KELM
38
11
0
23 May 2023
Skill-Based Few-Shot Selection for In-Context Learning
Shengnan An
Bo Zhou
Zeqi Lin
Qiang Fu
B. Chen
Nanning Zheng
Weizhu Chen
Jian-Guang Lou
36
31
0
23 May 2023
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks
Tiedong Liu
K. H. Low
ALM
37
81
0
23 May 2023
Better Zero-Shot Reasoning with Self-Adaptive Prompting
Xingchen Wan
Ruoxi Sun
H. Dai
Sercan Ö. Arik
Tomas Pfister
ReLM
OffRL
LRM
18
48
0
23 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang
Yangqiu Song
Xuan Ren
Chenyang Lyu
Yidong Wang
Lingqiao Liu
Jindong Wang
Jennifer Foster
Yue Zhang
OOD
37
2
0
23 May 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Seungone Kim
Se June Joo
Doyoung Kim
Joel Jang
Seonghyeon Ye
Jamin Shin
Minjoon Seo
ALM
RALM
LRM
23
96
0
23 May 2023
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
Xuekai Zhu
Biqing Qi
Kaiyan Zhang
Xingwei Long
Zhouhan Lin
Bowen Zhou
ALM
LRM
41
19
0
23 May 2023
Can Large Language Models Capture Dissenting Human Voices?
Noah Lee
Na Min An
James Thorne
ALM
44
30
0
23 May 2023
ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models
Binfeng Xu
Zhiyuan Peng
Bowen Lei
Subhabrata Mukherjee
Yuchen Liu
Dongkuan Xu
KELM
LLMAG
LRM
32
90
0
23 May 2023
Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos
Arthur Bražinskas
Jakub Adamek
Jonathan Mallinson
Aliaksei Severyn
Eric Malmi
BDL
LRM
33
14
0
22 May 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
48
21
0
22 May 2023
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
Boshi Wang
Xiang Yue
Huan Sun
ELM
LRM
46
60
0
22 May 2023
Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed
Gonçalo Mordido
Samira Shabanian
Sarath Chandar
37
10
0
22 May 2023
Making Language Models Better Tool Learners with Execution Feedback
Shuofei Qiao
Honghao Gui
Chengfei Lv
Qianghuai Jia
Huajun Chen
Ningyu Zhang
LLMAG
46
46
0
22 May 2023
Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study
Yuan Sui
Mengyu Zhou
Mingjie Zhou
Shi Han
Dongmei Zhang
LMTD
24
72
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
90
562
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Rada Mihalcea
LRM
41
6
0
21 May 2023
TheoremQA: A Theorem-driven Question Answering dataset
Wenhu Chen
Ming Yin
Max W.F. Ku
Pan Lu
Yixin Wan
Xueguang Ma
Jianyu Xu
Xinyi Wang
Tony Xia
AIMat
38
124
0
21 May 2023
PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs
Jiuzhou Han
Nigel Collier
Wray L. Buntine
Ehsan Shareghi
75
36
0
21 May 2023
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning
Liangming Pan
Alon Albalak
Xinyi Wang
William Yang Wang
ReLM
LRM
AI4CE
49
234
0
20 May 2023
Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond
Haw-Shiuan Chang
Zonghai Yao
Alolika Gon
Hong-ye Yu
Andrew McCallum
46
10
0
20 May 2023
VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models
Dao Xuan-Quy
Le Ngoc-Bich
Vo The-Duy
Phan Xuan-Dung
Ngo Bac-Bien
Nguyen Van-Tien
Nguyen Thi-My-Thanh
Nguyen Hong-Phuoc
24
16
0
20 May 2023