Training Verifiers to Solve Math Word Problems


27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
    ReLM
    OffRL
    LRM

Papers citing "Training Verifiers to Solve Math Word Problems"

50 / 3,034 papers shown
Title
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU
Fajri Koto
Nurul Aisyah
Haonan Li
Timothy Baldwin
AI4Ed
LRM
ELM
35
38
0
07 Oct 2023
Critique Ability of Large Language Models
Liangchen Luo
Zi Lin
Yinxiao Liu
Lei Shu
Yun Zhu
Jingbo Shang
Lei Meng
AI4MH
LRM
ELM
24
14
0
07 Oct 2023
Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models
Song Jiang
Zahra Shakeri
Aaron Chan
Maziar Sanjabi
Hamed Firooz
...
Bugra Akyildiz
Yizhou Sun
Jinchao Li
Qifan Wang
Asli Celikyilmaz
LRM
ReLM
26
8
0
07 Oct 2023
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Xiaoxiao Sun
Yang Yang
Michal Shlapentokh-Rothman
Haohan Wang
Yu-xiong Wang
LRM
AI4CE
LM&Ro
LLMAG
42
184
0
06 Oct 2023
Amortizing intractable inference in large language models
Marvin Schmitt
Moksh Jain
Daniel Habermann
Younesse Kaddar
Ullrich Kothe
Stefan T. Radev
Nikolay Malkin
AIFin
BDL
32
49
0
06 Oct 2023
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
Wanyun Cui
Qianle Wang
LRM
47
7
0
06 Oct 2023
Analysis of the Reasoning with Redundant Information Provided Ability of Large Language Models
Wenbei Xie
LRM
35
2
0
06 Oct 2023
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Ke Wang
Houxing Ren
Aojun Zhou
Zimu Lu
Sichun Luo
Weikang Shi
Renrui Zhang
Linqi Song
Mingjie Zhan
Hongsheng Li
ReLM
LRM
SyDa
30
95
0
05 Oct 2023
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Omar Khattab
Arnav Singhvi
Paridhi Maheshwari
Zhiyuan Zhang
Keshav Santhanam
...
Thomas T. Joshi
Hanna Moazam
Heather Miller
Matei A. Zaharia
Christopher Potts
RALM
38
236
0
05 Oct 2023
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Zhanhui Zhou
Jie Liu
Chao Yang
Jing Shao
Yu Liu
Xiangyu Yue
Wanli Ouyang
Yu Qiao
40
49
0
05 Oct 2023
Concise and Organized Perception Facilitates Reasoning in Large Language Models
Junjie Liu
Shaotian Yan
Chen Shen
Zhengdong Xiao
Wenxiao Wang
Jieping Ye
LRM
26
1
0
05 Oct 2023
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
Murong Yue
Jie Zhao
Min Zhang
Liang Du
Ziyu Yao
LRM
40
57
0
04 Oct 2023
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
S. Samsi
Dan Zhao
Joseph McDonald
Baolin Li
Adam Michaleas
Michael Jones
William Bergeron
J. Kepner
Devesh Tiwari
V. Gadepally
21
125
0
04 Oct 2023
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Xianjun Yang
Xiao Wang
Qi Zhang
Linda R. Petzold
William Y. Wang
Xun Zhao
Dahua Lin
26
165
0
04 Oct 2023
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions
Naiming Liu
Shashank Sonkar
Zichao Wang
Simon Woodhead
Richard G. Baraniuk
LRM
AI4Ed
31
14
0
03 Oct 2023
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
E. Zelikman
Eliana Lorch
Lester W. Mackey
Adam Tauman Kalai
LRM
ReLM
43
46
0
03 Oct 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu
Hritik Bansal
Tony Xia
Jiacheng Liu
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
43
511
0
03 Oct 2023
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal
Ziwei Ji
A. S. Rawat
A. Menon
Sanjiv Kumar
Vaishnavh Nagarajan
LRM
26
97
0
03 Oct 2023
Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
Qiming Xie
Zengzhi Wang
Yi Feng
Rui Xia
AAML
HILM
35
9
0
03 Oct 2023
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
Jintian Zhang
Xin Xu
Ningyu Zhang
Ruibo Liu
Bryan Hooi
Shumin Deng
LLMAG
44
124
0
03 Oct 2023
Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance
Saurabh Srivastava
Chengyue Huang
Weiguo Fan
Ziyu Yao
LLMAG
28
5
0
03 Oct 2023
Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems
Aniruddha Deb
Neeva Oza
Sarthak Singla
Dinesh Khandelwal
Dinesh Garg
Parag Singla
21
8
0
03 Oct 2023
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Suyu Ge
Yunan Zhang
Liyuan Liu
Minjia Zhang
Jiawei Han
Jianfeng Gao
4
222
0
03 Oct 2023
Large Language Models Cannot Self-Correct Reasoning Yet
Jie Huang
Xinyun Chen
Swaroop Mishra
Huaixiu Steven Zheng
Adams Wei Yu
Xinying Song
Denny Zhou
ReLM
LRM
38
424
0
03 Oct 2023
Large Language Models as Analogical Reasoners
Michihiro Yasunaga
Xinyun Chen
Yujia Li
Panupong Pasupat
J. Leskovec
Percy Liang
Ed H. Chi
Denny Zhou
ReLM
LRM
29
79
0
03 Oct 2023
RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Xi Lin
Xilun Chen
Mingda Chen
Weijia Shi
Maria Lomeli
...
Jacob Kahn
Gergely Szilvasy
Mike Lewis
Luke Zettlemoyer
Scott Yih
RALM
47
133
0
02 Oct 2023
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games
Yizhe Zhang
Jiarui Lu
Navdeep Jaitly
LRM
ELM
24
10
0
02 Oct 2023
Tool-Augmented Reward Modeling
Lei Li
Yekun Chai
Shuohuan Wang
Yu Sun
Hao Tian
Ningyu Zhang
Hua Wu
OffRL
46
13
0
02 Oct 2023
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
Xiaoqiang Lin
Zhaoxuan Wu
Zhongxiang Dai
Wenyang Hu
Yao Shu
See-Kiong Ng
Patrick Jaillet
Bryan Kian Hsiang Low
32
10
0
02 Oct 2023
TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks
Dongfu Jiang
Yishan Li
Ge Zhang
Wenhao Huang
Bill Yuchen Lin
Wenhu Chen
ALM
37
58
0
01 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
41
33
0
01 Oct 2023
Adaptive-Solver Framework for Dynamic Strategy Selection in Large Language Model Reasoning
Jianpeng Zhou
Wanjun Zhong
Yanlin Wang
Jiahai Wang
LRM
31
7
0
01 Oct 2023
Adapting LLM Agents with Universal Feedback in Communication
Kuan-Chieh Jackson Wang
Yadong Lu
Michael Santacroce
Yeyun Gong
Chao Zhang
Yelong Shen
LLMAG
36
7
0
01 Oct 2023
SELF: Self-Evolution with Language Feedback
Jianqiao Lu
Wanjun Zhong
Wenyong Huang
Yufei Wang
Qi Zhu
...
Weichao Wang
Xingshan Zeng
Lifeng Shang
Xin Jiang
Qun Liu
LRM
SyDa
29
6
0
01 Oct 2023
UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities
Hejia Geng
Boxun Xu
Peng Li
ELM
LRM
ReLM
41
1
0
30 Sep 2023
Understanding In-Context Learning from Repetitions
Jianhao Yan
Jin Xu
Chiyu Song
Chenming Wu
Yafu Li
Yue Zhang
30
22
0
30 Sep 2023
Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
Qiushi Sun
Zhangyue Yin
Xiang Li
Zhiyong Wu
Xipeng Qiu
Lingpeng Kong
LRM
LLMAG
28
44
0
30 Sep 2023
SocREval: Large Language Models with the Socratic Method for Reference-Free Reasoning Evaluation
Hangfeng He
Hongming Zhang
Dan Roth
LRM
ELM
ReLM
30
14
0
29 Sep 2023
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Zhibin Gou
Zhihong Shao
Yeyun Gong
Yelong Shen
Yujiu Yang
Minlie Huang
Nan Duan
Weizhu Chen
LRM
AI4CE
LLMAG
61
145
0
29 Sep 2023
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
Ansong Ni
Pengcheng Yin
Yilun Zhao
Chen Wei
Yanjun Wang
...
Mingyuan Zhang
Chen Change Loy
Yingbo Zhou
Dragomir R. Radev
Arman Cohan
ELM
32
18
0
29 Sep 2023
Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4
Jiaxian Guo
Bo Yang
Paul D. Yoo
Bill Yuchen Lin
Yusuke Iwasawa
Yutaka Matsuo
LLMAG
21
41
0
29 Sep 2023
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
Xidong Feng
Bo Liu
Muning Wen
Stephen Marcus McAleer
Ying Wen
Weinan Zhang
Jun Wang
LRM
AI4CE
38
160
0
29 Sep 2023
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
A. Maritan
Jiaao Chen
S. Dey
Luca Schenato
Diyi Yang
Xing Xie
ELM
LRM
27
43
0
29 Sep 2023
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
Chrisantha Fernando
Dylan Banarse
Henryk Michalewski
Simon Osindero
Tim Rocktaschel
LLMAG
ReLM
LRM
37
180
0
28 Sep 2023
Stress Testing Chain-of-Thought Prompting for Large Language Models
Aayush Mishra
Jitin Singla
LRM
8
2
0
28 Sep 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
73
1,622
0
28 Sep 2023
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond
Timothée Darcet
Yuyu Zhang
Yijie Zhu
Chenguang Xi
Pengyang Gao
Piotr Bojanowski
Kevin Chen-Chuan Chang
ELM
35
24
0
28 Sep 2023
Effective Long-Context Scaling of Foundation Models
Wenhan Xiong
Jingyu Liu
Igor Molybog
Hejia Zhang
Prajjwal Bhargava
...
Dániel Baráth
Sergey Edunov
Mike Lewis
Sinong Wang
Hao Ma
37
207
0
27 Sep 2023
NLPBench: Evaluating Large Language Models on Solving NLP Problems
Linxin Song
Jieyu Zhang
Lechao Cheng
Pengyuan Zhou
Dinesh Manocha
Irene Z Li
ELM
LM&MA
LRM
36
10
0
27 Sep 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming Liu
Bing Qin
Ting Liu
LRM
AI4CE
37
155
0
27 Sep 2023