arXiv:2110.14168
Training Verifiers to Solve Math Word Problems
27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
Papers citing
"Training Verifiers to Solve Math Word Problems"
50 / 3,115 papers shown
Title
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
Che Zhang
Zhenyang Xiao
Chengcheng Han
Yixin Lian
Yuejian Fang
LRM
33
0
0
20 Feb 2024
Code Needs Comments: Enhancing Code LLMs with Comment Augmentation
Demin Song
Honglin Guo
Yunhua Zhou
Shuhao Xing
Yudong Wang
...
Wenwei Zhang
Qipeng Guo
Hang Yan
Xipeng Qiu
Dahua Lin
SyDa
65
8
0
20 Feb 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Zhiyuan Li
Hong Liu
Denny Zhou
Tengyu Ma
LRM
AI4CE
30
101
0
20 Feb 2024
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models
Tongxu Luo
Jiahe Lei
Fangyu Lei
Weihao Liu
Shizhu He
Jun Zhao
Kang Liu
MoE
ALM
40
19
0
20 Feb 2024
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic
Fajri Koto
Haonan Li
Sara Shatnawi
Jad Doughman
Abdelrahman Boda Sadallah
...
Neha Sengupta
Shady Shehata
Nizar Habash
Preslav Nakov
Timothy Baldwin
ELM
LRM
85
31
0
20 Feb 2024
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
43
10
0
20 Feb 2024
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li
Bolin Zhu
Kaiwen Shi
Sichen Liu
Yin Zhu
Yiwei Liu
Gong Cheng
AIMat
45
0
0
20 Feb 2024
Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models
Loka Li
Zhenhao Chen
Guan-Hong Chen
Yixuan Zhang
Yusheng Su
Eric P. Xing
Kun Zhang
LRM
46
16
0
19 Feb 2024
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Mosh Levy
Alon Jacoby
Yoav Goldberg
50
70
0
19 Feb 2024
CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
Jueon Eom
Seyeon Jeong
Taekyoung Kwon
34
7
0
19 Feb 2024
Reformatted Alignment
Run-Ze Fan
Xuefeng Li
Haoyang Zou
Junlong Li
Shwai He
Ethan Chern
Jiewen Hu
Pengfei Liu
65
8
0
19 Feb 2024
Can LLMs Compute with Reasons?
Harshit Sandilya
Peehu Raj
J. Bafna
Srija Mukhopadhyay
Shivansh Sharma
Ellwil Sharma
Arastu Sharma
Neeta Trivedi
Manish Shrivastava
Rajesh Kumar
LRM
35
0
0
19 Feb 2024
A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task
Jannik Brinkmann
Abhay Sheshadri
Victor Levoso
Paul Swoboda
Christian Bartelt
LRM
37
24
0
19 Feb 2024
SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning
Zhihao Wen
Jie Zhang
Yuan Fang
MoE
39
3
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
58
2
0
19 Feb 2024
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Rishabh Bhardwaj
Do Duc Anh
Soujanya Poria
MoMe
50
38
0
19 Feb 2024
Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents
Renxi Wang
Haonan Li
Xudong Han
Yixuan Zhang
Timothy Baldwin
LLMAG
27
22
0
18 Feb 2024
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
Guijin Son
Sangwon Baek
Sangdae Nam
Ilgyun Jeong
Seungone Kim
ELM
LRM
42
14
0
18 Feb 2024
KMMLU: Measuring Massive Multitask Language Understanding in Korean
Guijin Son
Hanwool Albert Lee
Sungdong Kim
Seungone Kim
Niklas Muennighoff
Taekyoon Choi
Cheonbok Park
Kang Min Yoo
Stella Biderman
ALM
RALM
ELM
60
38
0
18 Feb 2024
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Zhaorun Chen
Zhuokai Zhao
Zhihong Zhu
Ruiqi Zhang
Xiang Li
Bhiksha Raj
Huaxiu Yao
LRM
30
25
0
18 Feb 2024
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation
Siyuan Wang
Zhuohan Long
Zhihao Fan
Zhongyu Wei
Xuanjing Huang
LLMAG
26
28
0
18 Feb 2024
CliqueParcel: An Approach For Batching LLM Prompts That Jointly Optimizes Efficiency And Faithfulness
Jiayi Liu
Tinghan Yang
Jennifer Neville
26
10
0
17 Feb 2024
Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Pragya Srivastava
Manuj Malik
Vivek Gupta
T. Ganu
Dan Roth
25
15
0
17 Feb 2024
I Learn Better If You Speak My Language: Understanding the Superior Performance of Fine-Tuning Large Language Models with LLM-Generated Responses
Xuan Ren
Biao Wu
Lingqiao Liu
36
6
0
17 Feb 2024
Orca-Math: Unlocking the potential of SLMs in Grade School Math
Arindam Mitra
Hamed Khanpour
Corby Rosset
Ahmed Hassan Awadallah
ALM
MoE
LRM
40
65
0
16 Feb 2024
Language Models as Science Tutors
Alexis Chevalier
Jiayi Geng
Alexander Wettig
Howard Chen
Sebastian Mizera
...
Jiatong Yu
Jun-Jie Zhu
Z. Ren
Sanjeev Arora
Danqi Chen
ELM
35
11
0
16 Feb 2024
II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering
Jihyung Kil
Farideh Tavazoee
Dongyeop Kang
Joo-Kyung Kim
LRM
41
2
0
16 Feb 2024
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator
Ziru Chen
Michael White
Raymond Mooney
Ali Payani
Yu-Chuan Su
Huan Sun
LLMAG
80
32
0
16 Feb 2024
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
Dingzirui Wang
Longxu Dou
Xuanliang Zhang
Qingfu Zhu
Wanxiang Che
LRM
39
0
0
16 Feb 2024
Can Separators Improve Chain-of-Thought Prompting?
Yoonjeong Park
Hyunjin Kim
Chanyeol Choi
Junseong Kim
Jy-yong Sohn
LRM
ReLM
26
2
0
16 Feb 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Dayou Du
Yijia Zhang
Shijie Cao
Jiaqi Guo
Ting Cao
Xuming Hu
Ningyi Xu
MQ
46
30
0
16 Feb 2024
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
Zekun Li
Zhiyu Zoey Chen
Mike Ross
Patrick Huber
Seungwhan Moon
Zhaojiang Lin
Xin Luna Dong
Adithya Sagar
Xifeng Yan
Paul A. Crook
46
22
0
16 Feb 2024
QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning
Hossein Rajabzadeh
Mojtaba Valipour
Tianshu Zhu
Marzieh S. Tahaei
Hyock Ju Kwon
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
32
9
0
16 Feb 2024
Can We Verify Step by Step for Incorrect Answer Detection?
Xin Xu
Shizhe Diao
Can Yang
Yang Wang
LRM
130
14
0
16 Feb 2024
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs
Yebowen Hu
Kaiqiang Song
Sangwoo Cho
Xiaoyang Wang
H. Foroosh
Dong Yu
Fei Liu
30
8
0
15 Feb 2024
Chain-of-Thought Reasoning Without Prompting
Xuezhi Wang
Denny Zhou
ReLM
LRM
152
104
0
15 Feb 2024
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Lingbo Mo
Zeyi Liao
Boyuan Zheng
Yu-Chuan Su
Chaowei Xiao
Huan Sun
AAML
LLMAG
51
15
0
15 Feb 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
James Liu
Guangxuan Xiao
Kai Li
Jason D. Lee
Song Han
Tri Dao
Tianle Cai
45
21
0
15 Feb 2024
Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu
Fanzhi Zeng
Jiaming Ji
Dong Yan
Kaile Wang
Jiayi Zhou
Yang Han
Josef Dai
Xuehai Pan
Yaodong Yang
AI4CE
37
4
0
15 Feb 2024
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
Shubham Toshniwal
Ivan Moshkov
Sean Narenthiran
Daria Gitman
Fei Jia
Igor Gitman
36
79
0
15 Feb 2024
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
34
52
0
15 Feb 2024
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
Siwei Yang
Bingchen Zhao
Cihang Xie
LRM
19
6
0
14 Feb 2024
AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails
Sankalan Pal Chowdhury
Vilém Zouhar
Mrinmaya Sachan
AI4Ed
LRM
29
14
0
14 Feb 2024
Instruction Backdoor Attacks Against Customized LLMs
Rui Zhang
Hongwei Li
Rui Wen
Wenbo Jiang
Yuan Zhang
Michael Backes
Yun Shen
Yang Zhang
AAML
SILM
35
25
0
14 Feb 2024
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Yejie Wang
Keqing He
Guanting Dong
Pei Wang
Weihao Zeng
...
Yutao Mou
Mengdi Zhang
Jingang Wang
Xunliang Cai
Weiran Xu
ALM
33
10
0
14 Feb 2024
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
Yinya Huang
Xiaohan Lin
Zhengying Liu
Qingxing Cao
Huajian Xin
Haiming Wang
Zhenguo Li
Linqi Song
Xiaodan Liang
ALM
43
35
0
14 Feb 2024
Premise Order Matters in Reasoning with Large Language Models
Xinyun Chen
Ryan A. Chi
Xuezhi Wang
Denny Zhou
ReLM
LRM
49
31
0
14 Feb 2024
Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji
Jiafan He
Quanquan Gu
29
17
0
14 Feb 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla
Sharath Raparthy
Christoforus Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Roberta Raileanu
ReLM
LRM
41
51
0
13 Feb 2024
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Haotian Sun
Yuchen Zhuang
Wei Wei
Chao Zhang
Bo Dai
27
3
0
13 Feb 2024