arXiv: 2110.14168
Training Verifiers to Solve Math Word Problems

27 October 2021
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
Lukasz Kaiser
Matthias Plappert
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
    ReLM
    OffRL
    LRM

Papers citing "Training Verifiers to Solve Math Word Problems"

50 / 3,031 papers shown
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang
Wen-Ding Li
Daniele Paliotta
Daniel Ritter
Alexander M. Rush
Tri Dao
LRM
36
0
0
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
34
0
0
14 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
29
0
0
14 Apr 2025
HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving
Avinash Kumar
Shashank Nag
Jason Clemons
L. John
Poulami Das
31
0
0
14 Apr 2025
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
Ming Li
Yong Li
Ziyue Li
Tianyi Zhou
LRM
27
1
0
14 Apr 2025
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution
Chenghao Li
Chaoning Zhang
Yi Lu
J. Zhang
Qigan Sun
X. Wang
Jiwei Wei
Guoqing Wang
Yang Yang
H. Shen
LRM
68
1
0
13 Apr 2025
Question Tokens Deserve More Attention: Enhancing Large Language Models without Training through Step-by-Step Reading and Question Attention Recalibration
Feijiang Han
Licheng Guo
Hengtao Cui
Zhiyuan Lyu
LRM
36
0
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
36
0
0
13 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
60
0
0
13 Apr 2025
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance
Zuoli Tang
Junjie Ou
Kaiqin Hu
Chunwei Wu
Zhaoxin Huan
Chilin Fu
Xiaolu Zhang
Jun Zhou
Chenliang Li
ReLM
LRM
43
0
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
38
0
0
13 Apr 2025
Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability
Haotian Wang
Han Zhao
Shuaiting Chen
Xiaoyu Tian
Sitong Zhao
Yunjie Ji
Yiping Peng
Xiangang Li
ReLM
LRM
54
0
0
13 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
45
2
0
12 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM
LRM
45
1
0
12 Apr 2025
Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries
Neil He
Jiahong Liu
Buze Zhang
N. Bui
Ali Maatouk
Menglin Yang
Irwin King
Melanie Weber
Rex Ying
29
0
0
11 Apr 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
FangZhi Xu
Hang Yan
Chang Ma
Haiteng Zhao
Qiushi Sun
Kanzhi Cheng
Junxian He
Jun Liu
Zhiyong Wu
LRM
34
1
0
11 Apr 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao
Qizhi Pei
Zinan Tang
Yong Li
Honglin Lin
Jiang Wu
C. He
Lijun Wu
SyDa
33
0
0
11 Apr 2025
SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting
Jiaming Xu
Jiayi Pan
Yongkang Zhou
Siming Chen
Jiajian Li
Yaoxiu Lian
Junyi Wu
Guohao Dai
LRM
37
0
0
11 Apr 2025
SD$^2$: Self-Distilled Sparse Drafters
Mike Lasby
Nish Sinnadurai
Valavan Manohararajah
Sean Lie
Vithursan Thangarasa
160
1
0
10 Apr 2025
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models
Hongcheng Guo
Juntao Yao
Boyang Wang
Junjia Du
Shaosheng Cao
Donglin Di
Shun Zhang
Zehan Li
MoE
40
0
0
10 Apr 2025
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang
Jiacheng You
Ashwinee Panda
Tom Goldstein
MoMe
53
1
0
10 Apr 2025
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Jun Zhang
Rushuai Yang
Shunyu Liu
Ting-En Lin
Fei Huang
Yi Chen
Yong Li
Dacheng Tao
OffRL
26
0
0
10 Apr 2025
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
Hanqi Xiao
Yi-Lin Sung
Elias Stengel-Eskin
Joey Tianyi Zhou
MQ
38
0
0
10 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
34
1
0
10 Apr 2025
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
Rosie Zhao
Alexandru Meterez
Sham Kakade
Cengiz Pehlevan
Samy Jelassi
Eran Malach
ReLM
LRM
129
2
0
10 Apr 2025
What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks
Pavel Chizhov
Mattia Nee
Pierre-Carl Langlais
Ivan P. Yamshchikov
ReLM
ELM
LRM
44
1
0
10 Apr 2025
Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric
Yixin Cao
Jiahao Ying
Yansen Wang
Xipeng Qiu
Xuanjing Huang
Yugang Jiang
ELM
44
2
0
10 Apr 2025
RAISE: Reinforenced Adaptive Instruction Selection For Large Language Models
Lv Qingsong
Yangning Li
Zihua Lan
Zishan Xu
Jiwei Tang
Hai-Tao Zheng
Wenhao Jiang
Wanshi Xu
Philip S. Yu
32
0
0
09 Apr 2025
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning
Yangning Li
Zihua Lan
Lv Qingsong
Hai-Tao Zheng
Hai-Tao Zheng
31
0
0
09 Apr 2025
ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning
Zijian Wang
Chang Xu
LRM
30
1
0
09 Apr 2025
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Chenrui Fan
Ming Li
Lichao Sun
Tianyi Zhou
LRM
51
3
0
09 Apr 2025
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning
Atharva Pandey
Kshitij Dubey
Rahul Sharma
Amit Sharma
ReLM
ELM
LRM
52
0
0
09 Apr 2025
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
Hossein Entezari Zarch
Lei Gao
Chaoyi Jiang
Murali Annavaram
LRM
31
0
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
31
0
0
08 Apr 2025
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
Pengfei Zhou
Fanrui Zhang
Xiaopeng Peng
Zhaopan Xu
Jiaxin Ai
...
Kai Wang
Xiaojun Chang
Wenqi Shao
Yang You
Kaipeng Zhang
ELM
LRM
32
0
0
08 Apr 2025
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
C. Xu
Ming-Yu Liu
P. Xu
Z. Liu
Wei Ping
M. Shoeybi
Bo Li
Bryan Catanzaro
27
1
0
08 Apr 2025
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following
Sai Adith Senthil Kumar
Hao Yan
Saipavan Perepa
Murong Yue
Ziyu Yao
62
0
0
08 Apr 2025
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao
Hanzhou Wu
Xinpeng Zhang
Athanasios V. Vasilakos
LRM
38
1
0
08 Apr 2025
Knowledge-Instruct: Effective Continual Pre-training from Limited Data using Instructions
O. Ovadia
Meni Brief
Rachel Lemberg
Eitam Sheetrit
CLL
KELM
47
0
0
08 Apr 2025
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov
Roman Garipov
Alina Shutova
George Yakushev
Vage Egiazarian
Anton Sinitsin
Denis Kuznedelev
Dan Alistarh
LRM
32
2
0
08 Apr 2025
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
Minki Kang
Jongwon Jeong
Jaewoong Cho
ALM
LRM
52
2
0
07 Apr 2025
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
Benjamin Lipkin
Benjamin LeBrun
Jacob Hoover Vigly
João Loula
David R. MacIver
...
Ryan Cotterell
Vikash K. Mansinghka
Timothy J. O'Donnell
Alexander K. Lew
Tim Vieira
29
0
0
07 Apr 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi
Yiyang Wu
Linxin Song
Dinesh Manocha
Jieyu Zhao
LRM
83
1
0
07 Apr 2025
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
Anna Goldie
Azalia Mirhoseini
Hao Zhou
Irene Cai
Christopher D. Manning
SyDa
OffRL
ReLM
LRM
114
3
0
07 Apr 2025
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
Will Cai
Tianneng Shi
Xuandong Zhao
Dawn Song
30
0
0
07 Apr 2025
Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification
Anqi Zhang
Yulin Chen
Jane Pan
Chen Zhao
Aurojit Panda
Jinyang Li
He He
ReLM
LRM
44
3
0
07 Apr 2025
OrderChain: A General Prompting Paradigm to Improve Ordinal Understanding Ability of MLLM
Jinhong Wang
Shuo Tong
Jian Liu
Dongqi Tang
Weiqiang Wang
Wentong Li
Hongxia Xu
Danny Chen
Jintai Chen
Jian Wu
LRM
26
0
0
07 Apr 2025
SEAL: Steerable Reasoning Calibration of Large Language Models for Free
Runjin Chen
Zhenyu (Allen) Zhang
Junyuan Hong
Souvik Kundu
Zhangyang Wang
OffRL
LRM
52
2
0
07 Apr 2025
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models
Ruikang Liu
Yuxuan Sun
Manyi Zhang
Haoli Bai
Xianzhi Yu
Tiezheng Yu
C. Yuan
Lu Hou
MQ
LRM
39
6
0
07 Apr 2025
Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models
Yang Yan
Yu Lu
Renjun Xu
Zhenzhong Lan
LRM
36
2
0
07 Apr 2025