2110.14168
Training Verifiers to Solve Math Word Problems
27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
Papers citing
"Training Verifiers to Solve Math Word Problems"
50 / 2,994 papers shown
Title
ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search
Yuyao Zhang
Tianshu Wang
Sirui Chen
Kun Wang
Xingyu Zeng
Hongyu Lin
Xianpei Han
Le Sun
Chaochao Lu
LRM
31
0
0
15 Apr 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
51
2
0
15 Apr 2025
A Dual-Space Framework for General Knowledge Distillation of Large Language Models
Xuzhi Zhang
Songming Zhang
Yunlong Liang
Fandong Meng
Yufeng Chen
Jinan Xu
Jie Zhou
26
0
0
15 Apr 2025
Teaching Large Language Models to Reason through Learning and Forgetting
Tianwei Ni
Allen Nie
Sapana Chaudhary
Yao Liu
Huzefa Rangwala
Rasool Fakoor
ReLM
CLL
LRM
142
0
0
15 Apr 2025
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian H. Magnusson
Nguyen Tai
Ben Bogin
David Heineman
Jena D. Hwang
...
Dirk Groeneveld
Oyvind Tafjord
Noah A. Smith
Pang Wei Koh
Jesse Dodge
ALM
37
0
0
15 Apr 2025
Dynamic Compressing Prompts for Efficient Inference of Large Language Models
Jinwu Hu
W. Zhang
Yufeng Wang
Yu Hu
Bin Xiao
Mingkui Tan
Qing Du
31
1
0
15 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
148
2
0
15 Apr 2025
Weight Ensembling Improves Reasoning in Language Models
Xingyu Dang
Christina Baek
Kaiyue Wen
Zico Kolter
Aditi Raghunathan
MoMe
LRM
63
1
0
14 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
70
15
1
14 Apr 2025
HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving
Avinash Kumar
Shashank Nag
Jason Clemons
L. John
Poulami Das
31
0
0
14 Apr 2025
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
Ming Li
Yongqian Li
Ziyue Li
Tianyi Zhou
LRM
27
1
0
14 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
29
0
0
14 Apr 2025
Training Small Reasoning LLMs with Cognitive Preference Alignment
Wenrui Cai
Chengyu Wang
Junbing Yan
Jun Huang
Xiangzhong Fang
LRM
26
1
0
14 Apr 2025
The Jailbreak Tax: How Useful are Your Jailbreak Outputs?
Kristina Nikolić
Luze Sun
Jie Zhang
F. Tramèr
28
0
0
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
34
0
0
14 Apr 2025
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems
Zaid Khan
Elias Stengel-Eskin
Archiki Prasad
Jaemin Cho
Joey Tianyi Zhou
31
0
0
14 Apr 2025
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang
Zichen Ding
Chang Ma
Zijie Chen
Qiushi Sun
Zhenzhong Lan
Junxian He
138
0
0
14 Apr 2025
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang
Wen-Ding Li
Daniele Paliotta
Daniel Ritter
Alexander M. Rush
Tri Dao
LRM
36
0
0
14 Apr 2025
Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability
Haotian Wang
Han Zhao
Shuaiting Chen
Xiaoyu Tian
Sitong Zhao
Yunjie Ji
Yiping Peng
Xiangang Li
ReLM
LRM
54
0
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
38
0
0
13 Apr 2025
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance
Zuoli Tang
Junjie Ou
Kaiqin Hu
Chunwei Wu
Zhaoxin Huan
Chilin Fu
Xiaolu Zhang
Jun Zhou
Chenliang Li
ReLM
LRM
43
0
0
13 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
60
0
0
13 Apr 2025
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution
Chenghao Li
Chaoning Zhang
Yi Lu
J. Zhang
Qigan Sun
X. Wang
Jiwei Wei
Guoqing Wang
Yang Yang
H. Shen
LRM
68
1
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
36
0
0
13 Apr 2025
Question Tokens Deserve More Attention: Enhancing Large Language Models without Training through Step-by-Step Reading and Question Attention Recalibration
Feijiang Han
Licheng Guo
Hengtao Cui
Zhiyuan Lyu
LRM
36
0
0
13 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM
LRM
45
1
0
12 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
45
2
0
12 Apr 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao
Qizhi Pei
Zinan Tang
Yongqian Li
Honglin Lin
Jiang Wu
C. He
Lijun Wu
SyDa
33
0
0
11 Apr 2025
SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting
Jiaming Xu
Jiayi Pan
Yongkang Zhou
Siming Chen
J. Li
Yaoxiu Lian
Junyi Wu
Guohao Dai
LRM
37
0
0
11 Apr 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
FangZhi Xu
Hang Yan
Chang Ma
Haiteng Zhao
Qiushi Sun
Kanzhi Cheng
Junxian He
Jun Liu
Zhiyong Wu
LRM
31
1
0
11 Apr 2025
Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries
Neil He
Jiahong Liu
Buze Zhang
N. Bui
Ali Maatouk
Menglin Yang
Irwin King
Melanie Weber
Rex Ying
29
0
0
11 Apr 2025
Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric
Yixin Cao
Jiahao Ying
Yibo Wang
Xipeng Qiu
Xuanjing Huang
Yugang Jiang
ELM
44
2
0
10 Apr 2025
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
Rosie Zhao
Alexandru Meterez
Sham Kakade
C. Pehlevan
Samy Jelassi
Eran Malach
ReLM
LRM
114
2
0
10 Apr 2025
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Junyuan Zhang
Rushuai Yang
Shunyu Liu
Ting-En Lin
Fei Huang
Yi Chen
Yongqian Li
Dacheng Tao
OffRL
20
0
0
10 Apr 2025
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models
Hongcheng Guo
Juntao Yao
Boyang Wang
Junjia Du
Shaosheng Cao
Donglin Di
Shun Zhang
Zehan Li
MoE
40
0
0
10 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
32
1
0
10 Apr 2025
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang
Jiacheng You
Ashwinee Panda
Tom Goldstein
MoMe
53
1
0
10 Apr 2025
SD²: Self-Distilled Sparse Drafters
Mike Lasby
Nish Sinnadurai
Valavan Manohararajah
Sean Lie
Vithursan Thangarasa
143
1
0
10 Apr 2025
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
Hanqi Xiao
Yi-Lin Sung
Elias Stengel-Eskin
Joey Tianyi Zhou
MQ
38
0
0
10 Apr 2025
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks
Pavel Chizhov
Mattia Nee
Pierre-Carl Langlais
Ivan P. Yamshchikov
ReLM
ELM
LRM
44
1
0
10 Apr 2025
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
Yangning Li
Zihua Lan
Lv Qingsong
Hai-Tao Zheng
31
0
0
09 Apr 2025
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning
Zijian Wang
Chang Xu
LRM
30
1
0
09 Apr 2025
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning
Atharva Pandey
Kshitij Dubey
Rahul Sharma
Amit Sharma
ReLM
ELM
LRM
52
0
0
09 Apr 2025
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models
Lv Qingsong
Yangning Li
Zihua Lan
Zishan Xu
Jiwei Tang
Hai-Tao Zheng
Wenhao Jiang
Hai-tao Zheng
Philip S. Yu
32
0
0
09 Apr 2025
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Chenrui Fan
Ming Li
Lichao Sun
Tianyi Zhou
LRM
51
3
0
09 Apr 2025
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov
Roman Garipov
Alina Shutova
George Yakushev
Vage Egiazarian
Anton Sinitsin
Denis Kuznedelev
Dan Alistarh
LRM
32
2
0
08 Apr 2025
Knowledge-Instruct: Effective Continual Pre-training from Limited Data using Instructions
O. Ovadia
Meni Brief
Rachel Lemberg
Eitam Sheetrit
CLL
KELM
47
0
0
08 Apr 2025
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
Pengfei Zhou
Fanrui Zhang
Xiaopeng Peng
Zhaopan Xu
Jiaxin Ai
...
Kai Wang
Xiaojun Chang
Wenqi Shao
Yang You
Kaipeng Zhang
ELM
LRM
32
0
0
08 Apr 2025
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following
Sai Adith Senthil Kumar
Hao Yan
Saipavan Perepa
Murong Yue
Ziyu Yao
62
0
0
08 Apr 2025
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
Hossein Entezari Zarch
Lei Gao
Chaoyi Jiang
Murali Annavaram
LRM
31
0
0
08 Apr 2025