ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.14168
  4. Cited By
Training Verifiers to Solve Math Word Problems

Training Verifiers to Solve Math Word Problems

27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
    ReLM
    OffRL
    LRM
ArXivPDFHTML

Papers citing "Training Verifiers to Solve Math Word Problems"

50 / 3,115 papers shown
Title
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual
  Math Problems?
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
...
Aojun Zhou
Pan Lu
Kai-Wei Chang
Peng Gao
Hongsheng Li
34
180
0
21 Mar 2024
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought
  Prompting
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting
Xiaoxue Cheng
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
LRM
AI4CE
ReLM
60
7
0
21 Mar 2024
Reinforcement Learning from Reflective Feedback (RLRF): Aligning and
  Improving LLMs via Fine-Grained Self-Reflection
Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection
Kyungjae Lee
Dasol Hwang
Sunghyun Park
Youngsoo Jang
Moontae Lee
48
8
0
21 Mar 2024
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric
  Behavior Understanding by Dyadic Contexts
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts
Guangzeng Han
Weisi Liu
Xiaolei Huang
Brian Borsari
41
21
0
20 Mar 2024
LeanReasoner: Boosting Complex Logical Reasoning with Lean
LeanReasoner: Boosting Complex Logical Reasoning with Lean
Dongwei Jiang
Marcio Fonseca
Shay B. Cohen
LRM
36
14
0
20 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
90
86
0
20 Mar 2024
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic
  Prompt Compression
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Zhuoshi Pan
Qianhui Wu
Huiqiang Jiang
Menglin Xia
Xufang Luo
...
Yuqing Yang
Chin-Yew Lin
H. Vicky Zhao
Lili Qiu
Dongmei Zhang
VLM
55
94
0
19 Mar 2024
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly
  Large Language Model Inference
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference
Baolin Li
Yankai Jiang
V. Gadepally
Devesh Tiwari
36
15
0
19 Mar 2024
Securing Large Language Models: Threats, Vulnerabilities and Responsible
  Practices
Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices
Sara Abdali
Richard Anarfi
C. Barberan
Jia He
PILM
73
25
0
19 Mar 2024
RankPrompt: Step-by-Step Comparisons Make Language Models Better
  Reasoners
RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners
Chi Hu
Yuan Ge
Xiangnan Ma
Hang Cao
Qiang Li
Yonghua Yang
Tong Xiao
Jingbo Zhu
ReLM
ELM
LRM
ALM
45
9
0
19 Mar 2024
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
Yongshuo Zong
Ondrej Bohdal
Timothy M. Hospedales
30
5
0
19 Mar 2024
RouterBench: A Benchmark for Multi-LLM Routing System
RouterBench: A Benchmark for Multi-LLM Routing System
Qitian Jason Hu
Jacob Bieker
Xiuyu Li
Nan Jiang
Benjamin Keigwin
Gaurav Ranganath
Kurt Keutzer
Shriyash Kaustubh Upadhyay
54
39
0
18 Mar 2024
What Makes Math Word Problems Challenging for LLMs?
What Makes Math Word Problems Challenging for LLMs?
KV Aditya Srivatsa
Ekaterina Kochmar
47
14
0
17 Mar 2024
BEnQA: A Question Answering and Reasoning Benchmark for Bengali and
  English
BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English
H. M. Q. H. Sheikh Shafayat
Rishav Hada
Isaac Cowhey
Rifki Afina
Jerry Tworek
Lorie De Leon
37
3
0
16 Mar 2024
Self-Consistency Boosts Calibration for Math Reasoning
Self-Consistency Boosts Calibration for Math Reasoning
Ante Wang
Linfeng Song
Ye Tian
Baolin Peng
Lifeng Jin
Haitao Mi
Jinsong Su
Dong Yu
LRM
35
5
0
14 Mar 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before
  Speaking
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
60
121
0
14 Mar 2024
Laying the Foundation First? Investigating the Generalization from
  Atomic Skills to Complex Reasoning Tasks
Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks
Yuncheng Huang
Qi He
Yipei Xu
Jiaqing Liang
Yanghua Xiao
LRM
46
1
0
14 Mar 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
36
58
0
14 Mar 2024
Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge
  in Datasets and Large Language Models
Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models
Zhuoqun Li
Hongyu Lin
Yaojie Lu
Hao Xiang
Xianpei Han
Le Sun
41
1
0
14 Mar 2024
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain
  Reinforcement Learning From AI Feedback
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback
Ang Li
Qiugen Xiao
Peng Cao
Jian Tang
Yi Yuan
...
Weidong Guo
Yukang Gan
Jeffrey Xu Yu
D. Wang
Ying Shan
VLM
ALM
44
10
0
13 Mar 2024
Gemma: Open Models Based on Gemini Research and Technology
Gemma: Open Models Based on Gemini Research and Technology
Gemma Team
Gemma Team Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
...
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
131
441
0
13 Mar 2024
Mastering Text, Code and Math Simultaneously via Fusing Highly
  Specialized Language Models
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models
Ning Ding
Yulin Chen
Ganqu Cui
Xingtai Lv
Weilin Zhao
Ruobing Xie
Bowen Zhou
Zhiyuan Liu
Maosong Sun
ALM
MoMe
AI4CE
43
7
0
13 Mar 2024
Rethinking Generative Large Language Model Evaluation for Semantic
  Comprehension
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei
Xi Chen
Linzi Luo
ELM
ALM
LRM
38
7
0
12 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe
OffRL
MoE
45
62
0
12 Mar 2024
Fine-tuning Large Language Models with Sequential Instructions
Fine-tuning Large Language Models with Sequential Instructions
Hanxu Hu
Simon Yu
Pinzhen Chen
Edoardo Ponti
ALM
LRM
84
15
0
12 Mar 2024
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese
  Large Language Models
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models
Yan Liu
Renren Jin
Ling Shi
Zheng Yao
Deyi Xiong
LRM
37
4
0
12 Mar 2024
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large
  Language Models by Summarizing Training Trajectories of Small Models
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
Yu Yang
Siddhartha Mishra
Jeffrey N Chiang
Baharan Mirzasoleiman
42
18
0
12 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
57
44
0
12 Mar 2024
$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking
  Reinforcement Learning Algorithms in Generative Language Model
(N,K)\mathbf{(N,K)}(N,K)-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model
Yufeng Zhang
Liyu Chen
Boyi Liu
Yingxiang Yang
Qiwen Cui
Yunzhe Tao
Hongxia Yang
125
0
0
11 Mar 2024
The pitfalls of next-token prediction
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
42
64
0
11 Mar 2024
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis
Yanming Liu
Xinyue Peng
Tianyu Du
Jianwei Yin
Weihao Liu
Xuhong Zhang
LRM
38
16
0
11 Mar 2024
Academically intelligent LLMs are not necessarily socially intelligent
Academically intelligent LLMs are not necessarily socially intelligent
Ruoxi Xu
Hongyu Lin
Xianpei Han
Le Sun
Yingfei Sun
ELM
37
6
0
11 Mar 2024
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in
  Korean
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
Eunsu Kim
Juyoung Suk
Philhoon Oh
Haneul Yoo
James Thorne
Alice Oh
ELM
80
17
0
11 Mar 2024
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small
  Language Models
Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
Minjie Zhu
Yichen Zhu
Xin Liu
Ning Liu
Zhiyuan Xu
Yaxin Peng
Chaomin Shen
Zhicai Ou
Feifei Feng
Jian Tang
VLM
57
20
0
10 Mar 2024
Algorithmic progress in language models
Algorithmic progress in language models
Anson Ho
T. Besiroglu
Ege Erdil
David Owen
Robi Rahman
Zifan Carl Guo
David Atkinson
Neil Thompson
J. Sevilla
42
16
0
09 Mar 2024
Decoding the AI Pen: Techniques and Challenges in Detecting AI-Generated
  Text
Decoding the AI Pen: Techniques and Challenges in Detecting AI-Generated Text
Sara Abdali
Richard Anarfi
C. Barberan
Jia He
DeLMO
39
10
0
09 Mar 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless
  Generative Inference of LLM
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Hao Kang
Qingru Zhang
Souvik Kundu
Geonhwa Jeong
Zaoxing Liu
Tushar Krishna
Tuo Zhao
MQ
49
80
0
08 Mar 2024
DeepSeek-VL: Towards Real-World Vision-Language Understanding
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Haoyu Lu
Wen Liu
Bo Zhang
Bing-Li Wang
Kai Dong
...
Yaofeng Sun
Chengqi Deng
Hanwei Xu
Zhenda Xie
Chong Ruan
VLM
41
309
0
08 Mar 2024
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in
  Long-Horizon Generation
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Zihao Wang
Hoang Trung-Dung
Haowei Lin
Jiaqi Li
Xiaojian Ma
Yitao Liang
ReLM
RALM
LRM
102
48
0
08 Mar 2024
How Far Are We from Intelligent Visual Deductive Reasoning?
How Far Are We from Intelligent Visual Deductive Reasoning?
Yizhe Zhang
Richard He Bai
Ruixiang Zhang
Jiatao Gu
Shuangfei Zhai
J. Susskind
Navdeep Jaitly
ReLM
LRM
54
14
0
07 Mar 2024
Common 7B Language Models Already Possess Strong Math Capabilities
Common 7B Language Models Already Possess Strong Math Capabilities
Chen Li
Weiqi Wang
Jingcheng Hu
Yixuan Wei
Nanning Zheng
Han Hu
Zheng Zhang
Houwen Peng
ALM
LRM
45
78
0
07 Mar 2024
Yi: Open Foundation Models by 01.AI
Yi: Open Foundation Models by 01.AI
01. AI
Alex Young
01.AI Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
150
514
0
07 Mar 2024
Teaching Large Language Models to Reason with Reinforcement Learning
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla
Yuqing Du
Sharath Chandra Raparthy
Christoforos Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Sainbayar Sukhbaatar
Roberta Raileanu
ReLM
LRM
39
73
0
07 Mar 2024
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang
Lianmin Zheng
Ying Sheng
Anastasios Nikolas Angelopoulos
Tianle Li
...
Hao Zhang
Banghua Zhu
Michael I. Jordan
Joseph E. Gonzalez
Ion Stoica
OSLM
23
503
0
07 Mar 2024
Can Large Language Models do Analytical Reasoning?
Can Large Language Models do Analytical Reasoning?
Yebowen Hu
Kaiqiang Song
Sangwoo Cho
Xiaoyang Wang
H. Foroosh
Dong Yu
Fei Liu
ELM
ReLM
LRM
27
2
0
06 Mar 2024
Learning to Decode Collaboratively with Multiple Language Models
Learning to Decode Collaboratively with Multiple Language Models
Zejiang Shen
Hunter Lang
Bailin Wang
Yoon Kim
David Sontag
56
29
0
06 Mar 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You
  Expect
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
38
107
0
06 Mar 2024
Benchmarking Hallucination in Large Language Models based on
  Unanswerable Math Word Problem
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
Yuhong Sun
Zhangyue Yin
Qipeng Guo
Jiawen Wu
Xipeng Qiu
Hui Zhao
41
14
0
06 Mar 2024
Should We Fear Large Language Models? A Structural Analysis of the Human
  Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens
  of Heidegger's Philosophy
Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy
Jianqiiu Zhang
ELM
40
1
0
05 Mar 2024
PARADISE: Evaluating Implicit Planning Skills of Language Models with
  Procedural Warnings and Tips Dataset
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset
Arda Uzunouglu
Abdalfatah Rashid Safa
Gözde Gül Sahin
LRM
30
2
0
05 Mar 2024
Previous
123...414243...616263
Next