2206.02336
Making Large Language Models Better Reasoners with Step-Aware Verifier
6 June 2022
Yifei Li
Zeqi Lin
Shizhuo Zhang
Qiang Fu
B. Chen
Jian-Guang Lou
Weizhu Chen
ReLM
LRM
Papers citing
"Making Large Language Models Better Reasoners with Step-Aware Verifier"
50 / 53 papers shown
Title
DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling
Fei Wang
Xingchen Wan
Ruoxi Sun
Jiefeng Chen
Sercan Ö. Arık
LRM
12
0
0
19 Jun 2025
SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling
Md Imbesat Hassan Rizvi
Xiaodan Zhu
Iryna Gurevych
LRM
37
0
0
18 Jun 2025
Learning to Reason Across Parallel Samples for LLM Reasoning
Jianing Qi
Xi Ye
Hao Tang
Zhigang Zhu
Eunsol Choi
ReLM
LRM
22
0
0
10 Jun 2025
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai
Zengjie Hu
Fupeng Sun
Jiantao Qiu
Yizhen Jiang
Guangxin He
Bohan Zeng
Conghui He
Binhang Yuan
Wentao Zhang
OffRL
LRM
17
0
0
08 Jun 2025
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Parshin Shojaee
Iman Mirzadeh
Keivan Alizadeh
Maxwell Horton
Samy Bengio
Mehrdad Farajtabar
LRM
36
9
0
07 Jun 2025
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
Ruichen Zhang
Rana Muhammad Shahroz Khan
Zhen Tan
Dawei Li
Song Wang
Tianlong Chen
LRM
57
0
0
24 May 2025
T²: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering
Zhengyi Zhao
Shubo Zhang
Zezhong Wang
Huimin Wang
Yutian Zhao
Bin Liang
Yefeng Zheng
Binyang Li
Kam-Fai Wong
X. Wu
LRM
89
0
0
23 May 2025
ProgRM: Build Better GUI Agents with Progress Rewards
Danyang Zhang
Situo Zhang
Ziyue Yang
Zichen Zhu
Zihan Zhao
Ruisheng Cao
Lu Chen
Kai Yu
82
0
0
23 May 2025
Real-Time Verification of Embodied Reasoning for Generative Skill Acquisition
Bo Yue
Shuqi Guo
Kaiyu Hu
Chujiao Wang
Benyou Wang
Kui Jia
Guiliang Liu
LRM
111
0
0
16 May 2025
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
Yexiang Liu
Zekun Li
Zhi Fang
Nan Xu
Ran He
Tieniu Tan
LRM
76
0
0
16 May 2025
When Reasoning Beats Scale: A 1.5B Reasoning Model Outranks 13B LLMs as Discriminator
Md Fahim Anjum
LRM
130
0
0
30 Apr 2025
Process Reward Models That Think
Muhammad Khalifa
Rishabh Agarwal
Lajanugen Logeswaran
Jaekyeom Kim
Hao Peng
Moontae Lee
Honglak Lee
Lu Wang
OffRL
ALM
LRM
143
9
0
23 Apr 2025
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan
Chen Henry Wu
Charles Ding
Aditi Raghunathan
121
0
0
21 Apr 2025
CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
Feiyang Li
Peng Fang
Zhan Shi
Arijit Khan
Fang Wang
Dan Feng
Weihao Wang
Xin Zhang
Yongjian Cui
ReLM
LRM
112
1
0
18 Apr 2025
Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG
LRM
133
1
0
24 Mar 2025
Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners
Miao Peng
Nuo Chen
Zongrui Suo
Jia Li
LRM
99
1
0
02 Mar 2025
Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation
Yiwei Li
Ji Zhang
Shaoxiong Feng
Peiwen Yuan
Xinyu Wang
...
Y. Zhang
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
HILM
152
2
0
27 Feb 2025
Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
Juntai Cao
Xiang Zhang
Raymond Li
Chuyuan Li
Shafiq Joty
Giuseppe Carenini
179
2
0
27 Feb 2025
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light
Wei Cheng
Benjamin Rivière
Wu Yue
Masafumi Oyamada
Mengdi Wang
Yisong Yue
Santiago Paternain
Haifeng Chen
ReLM
LRM
129
4
0
23 Feb 2025
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
Guangzhi Sun
Yudong Yang
Jimin Zhuang
Changli Tang
Yongqian Li
W. Li
Zejun Ma
Chao Zhang
LRM
MLLM
VLM
128
5
0
17 Feb 2025
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid
Ruotian Wu
Julia Grosse
Agustinus Kristiadi
Pascal Poupart
OffRL
164
0
0
17 Feb 2025
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao
Geyang Guo
Xingxing Zhang
Nancy F. Chen
Shafiq Joty
Furu Wei
LRM
212
16
0
17 Feb 2025
Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee
Julia Hockenmaier
LRM
ELM
155
2
0
17 Feb 2025
Examining False Positives under Inference Scaling for Mathematical Reasoning
Yu Guang Wang
Nan Yang
Liang Wang
Furu Wei
LRM
144
4
0
10 Feb 2025
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
L. Yang
Zhaochen Yu
Tengjiao Wang
Mengdi Wang
ReLM
LRM
AI4CE
183
18
0
10 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
223
7
0
06 Feb 2025
SecPE: Secure Prompt Ensembling for Private and Robust Large Language Models
Jiawen Zhang
Kejia Chen
Zunlei Feng
Jian Lou
Mingli Song
Qingbin Liu
Xiaoyu Yang
AAML
SILM
FedML
173
1
0
02 Feb 2025
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
235
14
0
03 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zizhuo Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
ReLM
206
197
0
30 Dec 2024
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
Jonas Zausinger
Lars Pennig
Anamarija Kozina
Sean Sdahl
Julian Sikora
...
Anna Ketteler
Thorben Prein
Vishwa Mohan Singh
Michael Morris Danziger
Jannis Born
82
3
0
04 Nov 2024
Markov Chain of Thought for Efficient Mathematical Reasoning
Wen Yang
Kai Fan
Minpeng Liao
LRM
54
5
0
23 Oct 2024
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps
Xiongtao Zhou
Jie He
Lanyu Chen
Jingyu Li
Haojing Chen
Víctor Gutiérrez-Basulto
Jeff Z. Pan
Ningyu Zhang
LRM
186
2
0
18 Oct 2024
Process Reward Model with Q-Value Rankings
W. Li
Yixuan Li
LRM
154
25
0
15 Oct 2024
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs
Ishan Jindal
Chandana Badrinath
Pranjal Bharti
Lakkidi Vinay
Sachin Dev Sharma
CLL
ALM
85
1
0
14 Oct 2024
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure
Romain Puech
Jakub Macina
Julia Chatain
Mrinmaya Sachan
Manu Kapur
AI4Ed
103
5
0
03 Oct 2024
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Shengyu Feng
Xiang Kong
Shuang Ma
Aonan Zhang
Dong Yin
Chong-Jun Wang
Ruoming Pang
Yiming Yang
LRM
120
2
0
02 Oct 2024
Seek and Solve Reasoning for Table Question Answering
Ruya Jiang
Chun Wang
Weihong Deng
LMTD
ReLM
LRM
117
3
0
09 Sep 2024
PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars
Sumanth Prabhu
89
1
0
16 Aug 2024
CAVE: Controllable Authorship Verification Explanations
Sahana Ramnath
Kartik Pandey
Elizabeth Boschee
Xiang Ren
161
2
0
24 Jun 2024
DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues
Xiang Luo
Zhiwen Tang
Jin Wang
Xuejie Zhang
108
6
0
16 May 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRM
KELM
ReLM
120
43
0
26 Apr 2024
Compositional API Recommendation for Library-Oriented Code Generation
Zexiong Ma
Shengnan An
Bing Xie
Zeqi Lin
85
18
0
29 Feb 2024
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li
Bolin Zhu
Kaiwen Shi
Sichen Liu
Yin Zhu
Yiwei Liu
Gong Cheng
AIMat
90
1
0
20 Feb 2024
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Alon Jacovi
Yonatan Bitton
Bernd Bohnet
Jonathan Herzig
Or Honovich
Michael Tseng
Michael Collins
Roee Aharoni
Mor Geva
LRM
131
27
0
01 Feb 2024
Demystifying Chains, Trees, and Graphs of Thoughts
Maciej Besta
Florim Memedi
Zhenyu Zhang
Robert Gerstenberger
Guangyuan Piao
...
Aleš Kubíček
H. Niewiadomski
Aidan O'Mahony
Onur Mutlu
Torsten Hoefler
AI4CE
LRM
364
33
0
25 Jan 2024
ARGS: Alignment as Reward-Guided Search
Maxim Khanov
Jirayu Burapacheep
Yixuan Li
127
62
0
23 Jan 2024
Do LLM Agents Exhibit Social Behavior?
Yan Leng
Yuan Yuan
LLMAG
114
36
0
23 Dec 2023
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
Yangyang Yu
Haohang Li
Zhi Chen
Yuechen Jiang
Yang Li
Denghui Zhang
Rong Liu
Jordan W. Suchow
K. Khashanah
103
72
0
23 Nov 2023
AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations
Zhicheng Yang
Yinya Huang
Jing Xiong
Liang Feng
Xiaodan Liang
Yiwei Wang
Jing Tang
LRM
87
2
0
22 Nov 2023
Towards A Unified View of Answer Calibration for Multi-Step Reasoning
Shumin Deng
Ningyu Zhang
Nay Oo
Bryan Hooi
LRM
89
3
0
15 Nov 2023