arXiv:2410.17621
Cited By
Process Supervision-Guided Policy Optimization for Code Generation
23 October 2024
Ning Dai
Zheng Wu
Renjie Zheng
Ziyun Wei
Wenlei Shi
Xing Jin
Guanlin Liu
Chen Dun
Liang Huang
Lin Yan
Papers citing
"Process Supervision-Guided Policy Optimization for Code Generation"
6 / 6 papers shown
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou
Austin Xu
Peifeng Wang
Caiming Xiong
Chenyu You
ELM
ALM
LRM
58
3
0
21 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
45
3
0
12 Apr 2025
IterPref: Focal Preference Learning for Code Generation via Iterative Debugging
Jie Wu
Haoling Li
Xin Zhang
Jianwen Luo
Yangyu Huang
Ruihang Chu
Yuqing Yang
Scarlett Li
75
0
0
04 Mar 2025
Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye
Ting Zhang
Wenbin Jiang
Hua Huang
OffRL
LRM
SyDa
63
1
0
03 Feb 2025
Outcome-Refining Process Supervision for Code Generation
Zhuohao Yu
Weizheng Gu
Yidong Wang
Zhengran Zeng
Jindong Wang
Wei Ye
Shikun Zhang
LRM
92
4
0
19 Dec 2024
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
Junyi Li
Hwee Tou Ng
LRM
97
1
0
19 Dec 2024