arXiv:2110.14168
Training Verifiers to Solve Math Word Problems
27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
Papers citing "Training Verifiers to Solve Math Word Problems"
50 / 3,115 papers shown
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
Haoran Sun
Lixin Liu
Junjie Li
Fengyu Wang
Baohua Dong
Ran Lin
Ruohui Huang
33
16
0
03 Apr 2024
Automatic Prompt Selection for Large Language Models
Viet-Tung Do
Van-Khanh Hoang
Duy-Hung Nguyen
Shahab Sabahi
Jeff Yang
Hajime Hotta
Minh-Tien Nguyen
Hung Le
40
6
0
03 Apr 2024
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Hyungjoo Chae
Yeonghyeon Kim
Seungone Kim
Kai Tzu-iunn Ong
Beong-woo Kwak
...
Seonghwan Kim
Taeyoon Kwon
Jiwan Chung
Youngjae Yu
Jinyoung Yeo
LRM
ReLM
45
14
0
03 Apr 2024
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng
Zhaohui Wang
Muhan Zhang
VLM
64
79
0
03 Apr 2024
$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning
Gurusha Juneja
Subhabrata Dutta
Tanmoy Chakraborty
ReLM
LRM
40
2
0
02 Apr 2024
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan
Ganqu Cui
Hanbin Wang
Ning Ding
Xingyao Wang
...
Zhenghao Liu
Bowen Zhou
Hao Peng
Zhiyuan Liu
Maosong Sun
LRM
50
101
0
02 Apr 2024
Risks from Language Models for Automated Mental Healthcare: Ethics and Structure for Implementation
Declan Grabb
Max Lamparth
N. Vasan
53
15
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
45
8
0
02 Apr 2024
Poro 34B and the Blessing of Multilinguality
Risto Luukkonen
Jonathan Burdge
Elaine Zosa
Aarne Talman
Ville Komulainen
Vaino Hatanpaa
Peter Sarlin
S. Pyysalo
AI4CE
55
12
0
02 Apr 2024
PATCH -- Psychometrics-AssisTed benCHmarking of Large Language Models: A Case Study of Mathematics Proficiency
Qixiang Fang
Daniel L. Oberski
Dong Nguyen
51
3
0
02 Apr 2024
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis
Chen Yang
Junzhuo Li
Xinyao Niu
Xinrun Du
Songyang Gao
...
Stephen W. Huang
Shawn Yue
Wenhu Chen
Jie Fu
Ge Zhang
43
2
0
01 Apr 2024
Exploring the Mystery of Influential Data for Mathematical Reasoning
Xinzhe Ni
Yeyun Gong
Zhibin Gou
Yelong Shen
Yujiu Yang
Nan Duan
Weizhu Chen
47
9
0
01 Apr 2024
Can LLMs get help from other LLMs without revealing private information?
Florian Hartmann
D. Tran
Peter Kairouz
Victor Carbune
Blaise Agüera y Arcas
33
6
0
01 Apr 2024
Evalverse: Unified and Accessible Library for Large Language Model Evaluation
Jihoo Kim
Wonho Song
Dahyun Kim
Yunsu Kim
Yungi Kim
Chanjun Park
ELM
76
3
0
01 Apr 2024
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models
Wei He
Shichun Liu
Jun Zhao
Yiwen Ding
Yi Lu
Zhiheng Xi
Tao Gui
Qi Zhang
Xuanjing Huang
56
1
0
01 Apr 2024
Bailong: Bilingual Transfer Learning based on QLoRA and Zip-tie Embedding
Lung-Chuan Chen
Zong-Ru Li
ALM
45
0
0
01 Apr 2024
The Larger the Better? Improved LLM Code-Generation via Budget Reallocation
Michael Hassid
Tal Remez
Jonas Gehring
Roy Schwartz
Yossi Adi
41
20
0
31 Mar 2024
Learning to Plan for Language Modeling from Unlabeled Data
Nathan Cornille
Marie-Francine Moens
Florian Mai
38
7
0
31 Mar 2024
Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
Xiao Liu
Xixuan Song
Yuxiao Dong
Jie Tang
SyDa
36
5
0
31 Mar 2024
PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression
Muhammad Asif Ali
Zhengping Li
Shu Yang
Keyuan Cheng
Yang Cao
Tianhao Huang
Lijie Hu
Lu Yu
Di Wang
VLM
RALM
60
9
0
30 Mar 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura
Mayank Mishra
Simone Tedeschi
Yekun Chai
Jason T Stillerman
...
Virendra Mehta
Matthew Blumberg
Victor May
Huu Nguyen
S. Pyysalo
LRM
53
7
0
30 Mar 2024
Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange
Ankit Satpute
Noah Giessing
André Greiner-Petter
M. Schubotz
O. Teschke
Akiko Aizawa
Bela Gipp
ELM
LRM
44
21
0
30 Mar 2024
Measuring Taiwanese Mandarin Language Understanding
Po-Heng Chen
Sijia Cheng
Wei-Lin Chen
Yen-Ting Lin
Yun-Nung Chen
ELM
57
2
0
29 Mar 2024
Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models
Jesse Atuhurra
Iqra Ali
Tatsuya Hiraoka
Hidetaka Kamigaito
Tomoya Iwakura
Taro Watanabe
51
1
0
29 Mar 2024
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning
Yongqi Tong
Dawei Li
Sizhe Wang
Yujia Wang
Fei Teng
Jingbo Shang
LRM
39
49
0
29 Mar 2024
MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models
Peng Ding
Jiading Fang
Peng Li
Kangrui Wang
Xiaochen Zhou
Mo Yu
Jing Li
Matthew R. Walter
Hongyuan Mei
RALM
ELM
55
6
0
29 Mar 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
39
211
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
52
16
0
28 Mar 2024
sDPO: Don't Use Your Data All at Once
Dahyun Kim
Yungi Kim
Wonho Song
Hyeonwoo Kim
Yunsu Kim
Sanghoon Kim
Chanjun Park
36
31
0
28 Mar 2024
Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering
Yexin Wu
Zhuosheng Zhang
Hai Zhao
LRM
27
4
0
28 Mar 2024
Learning From Correctness Without Prompting Makes LLM Efficient Reasoner
Yuxuan Yao
Han Wu
Zhijiang Guo
Biyan Zhou
Jiahui Gao
Sichun Luo
Hanxu Hou
Xiaojin Fu
Linqi Song
LLMAG
LRM
50
9
0
28 Mar 2024
Large Language Models Are Struggle to Cope with Unreasonability in Math Problems
Jingyuan Ma
Damai Dai
Zihang Yuan
Rui Li
Weilin Luo
Bin Wang
Qun Liu
Lei Sha
Zhifang Sui
LRM
101
4
0
28 Mar 2024
The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian
Andrea Esuli
Giovanni Puccetti
ELM
35
0
0
27 Mar 2024
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu
Zichen Zhu
Situo Zhang
Da Ma
Shuai Fan
Lu Chen
Kai Yu
HILM
44
35
0
27 Mar 2024
Dual Instruction Tuning with Large Language Models for Mathematical Reasoning
Yongwei Zhou
Tiejun Zhao
LRM
30
7
0
27 Mar 2024
Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation
Chuwen Wang
Shirong Zeng
Cheng Wang
LLMAG
LRM
32
2
0
27 Mar 2024
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Peng Zhou
Charles Staats
Wenda Li
Christian Szegedy
Kilian Q. Weinberger
Yuhuai Wu
LRM
34
29
0
26 Mar 2024
Large Language Models for Education: A Survey and Outlook
Shen Wang
Tianlong Xu
Hang Li
Chaoli Zhang
Joleen Liang
Jiliang Tang
Philip S. Yu
Qingsong Wen
AI4Ed
54
99
0
26 Mar 2024
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
Yuelin Bai
Xinrun Du
Yiming Liang
Yonggang Jin
Ziqiang Liu
...
Chenghua Lin
Jie Fu
Min Yang
Shiwen Ni
Ge Zhang
ALM
48
33
0
26 Mar 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Rui Pan
Xiang Liu
Shizhe Diao
Renjie Pi
Jipeng Zhang
Chi Han
Tong Zhang
48
38
0
26 Mar 2024
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
Jinyi Li
Yihuai Lan
Lei Wang
Hao Wang
35
0
0
26 Mar 2024
ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales?
Fan Huang
Haewoon Kwak
Kunwoo Park
Jisun An
ALM
ELM
AI4MH
45
12
0
26 Mar 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
59
85
0
26 Mar 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
Leveraging Zero-Shot Prompting for Efficient Language Model Distillation
Lukas Vöge
Vincent Gurgul
Stefan Lessmann
32
0
0
23 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
73
48
0
23 Mar 2024
Can large language models explore in-context?
Akshay Krishnamurthy
Keegan Harris
Dylan J. Foster
Cyril Zhang
Aleksandrs Slivkins
LM&Ro
LLMAG
LRM
136
24
0
22 Mar 2024
Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach
Kun Sun
Rong Wang
Anders Sogaard
37
3
0
22 Mar 2024
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Nicholas Lee
Thanakul Wattanawong
Sehoon Kim
K. Mangalam
Sheng Shen
Gopala Anumanchipalli
Michael W. Mahoney
Kurt Keutzer
A. Gholami
69
46
0
22 Mar 2024
A Picture Is Worth a Graph: Blueprint Debate on Graph for Multimodal Reasoning
Changmeng Zheng
Dayong Liang
Wengyu Zhang
Xiao Wei
Tat-Seng Chua
Qing Li
43
3
0
22 Mar 2024