ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.02658
  4. Cited By
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on
  Model-induced Process Supervision

Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision

5 February 2024
Zihan Wang
Yunxuan Li
Yuexin Wu
Liangchen Luo
Le Hou
Hongkun Yu
Jingbo Shang
    LRM
ArXivPDFHTML

Papers citing "Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision"

19 / 19 papers shown
Title
Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents
Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents
Karina Zainullina
Alexander Golubev
Maria Trofimova
Sergei Polezhaev
Ibragim Badertdinov
...
Filipp Fisin
Sergei Skvortsov
Maksim Nekrashevich
Anton Shevtsov
Boris Yangel
2
0
0
19 May 2025
MARGE: Improving Math Reasoning for LLMs with Guided Exploration
MARGE: Improving Math Reasoning for LLMs with Guided Exploration
Jingyue Gao
Runji Lin
Keming Lu
Bowen Yu
Junyang Lin
Jianyu Chen
LRM
9
0
0
18 May 2025
Lightweight Latent Verifiers for Efficient Meta-Generation Strategies
Lightweight Latent Verifiers for Efficient Meta-Generation Strategies
Bartosz Piotrowski
Witold Drzewakowski
Konrad Staniszewski
Piotr Miłoś
LRM
36
0
0
23 Apr 2025
Exploring Expert Failures Improves LLM Agent Tuning
Exploring Expert Failures Improves LLM Agent Tuning
Li-Cheng Lan
Andrew Bai
Minhao Cheng
Ruochen Wang
Cho-Jui Hsieh
LRM
192
0
0
17 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
45
2
0
12 Apr 2025
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG
LRM
61
0
0
24 Mar 2025
A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1
A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1
Jun Wang
LRM
KELM
70
2
0
15 Feb 2025
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Thomas Zeng
Shuibai Zhang
Shutong Wu
Christian Classen
Daewon Chae
...
Jungtaek Kim
H. Koo
Kannan Ramchandran
Dimitris Papailiopoulos
Kangwook Lee
LRM
74
2
0
10 Feb 2025
Flow-DPO: Improving LLM Mathematical Reasoning through Online
  Multi-Agent Learning
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
Yihe Deng
Paul Mineiro
LRM
26
3
0
29 Oct 2024
Process Supervision-Guided Policy Optimization for Code Generation
Process Supervision-Guided Policy Optimization for Code Generation
Ning Dai
Zheng Wu
Renjie Zheng
Ziyun Wei
Wenlei Shi
Xing Jin
Guanlin Liu
Chen Dun
Liang Huang
Lin Yan
56
8
0
23 Oct 2024
OpenR: An Open Source Framework for Advanced Reasoning with Large
  Language Models
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Jun Wang
Meng Fang
Bo Liu
Muning Wen
Jiachen Zhu
...
Lei Chen
Lionel M. Ni
Linyi Yang
Ying Wen
Weixi Zhang
LRM
32
31
0
12 Oct 2024
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language
  Model Mathematical Reasoning
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning
Ruosen Li
Ziming Luo
Xinya Du
LRM
34
0
0
08 Oct 2024
Improving LLM Reasoning through Scaling Inference Computation with
  Collaborative Verification
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
Zhenwen Liang
Ye Liu
Tong Niu
Xiangliang Zhang
Yingbo Zhou
Semih Yavuz
LRM
34
18
0
05 Oct 2024
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Shengyu Feng
Xiang Kong
Shuang Ma
Aonan Zhang
Dong Yin
Chong-Jun Wang
Ruoming Pang
Yiming Yang
LRM
32
0
0
02 Oct 2024
Improve Mathematical Reasoning in Language Models by Automated Process
  Supervision
Improve Mathematical Reasoning in Language Models by Automated Process Supervision
Liangchen Luo
Yinxiao Liu
Rosanne Liu
Samrat Phatale
Harsh Lara
...
Lei Shu
Yun Zhu
Lei Meng
Jiao Sun
Abhinav Rastogi
LRM
51
140
0
05 Jun 2024
Process-Driven Autoformalization in Lean 4
Process-Driven Autoformalization in Lean 4
Jianqiao Lu
Zhengying Liu
Yingjia Wan
Yinya Huang
Haiming Wang
Zhicheng YANG
Jing Tang
Zhijiang Guo
AI4CE
45
16
0
04 Jun 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
36
54
0
14 Mar 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
309
0
05 Jan 2024
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
320
3,273
0
21 Mar 2022
1