2502.03492
Teaching Language Models to Critique via Reinforcement Learning
5 February 2025
Zhihui Xie
Jie Chen
Lu Chen
Weichao Mao
Jingjing Xu
Dianbo Sui
ALM
LRM
Papers citing "Teaching Language Models to Critique via Reinforcement Learning"
8 / 8 papers shown
Title
Training Language Models to Generate Quality Code with Program Analysis Feedback
Feng Yao
Zilong Wang
Liyuan Liu
Junxia Cui
Li Zhong
Xiaohan Fu
Haohui Mai
Vish Krishnan
Jianfeng Gao
Jingbo Shang
30
0
0
28 May 2025
Think Only When You Need with Large Hybrid-Reasoning Models
Lingjie Jiang
Xun Wu
Shaohan Huang
Qingxiu Dong
Zewen Chi
Li Dong
Xingxing Zhang
Tengchao Lv
Lei Cui
Furu Wei
OffRL
LRM
149
5
0
20 May 2025
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu
Tian Liang
Zhiwei He
Jiahao Xu
Wenxuan Wang
Pinjia He
Zhaopeng Tu
Haitao Mi
Dong Yu
OffRL
ReLM
LRM
102
0
0
19 May 2025
J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization
Austin Xu
Yilun Zhou
Xuan-Phi Nguyen
Caiming Xiong
Shafiq Joty
ELM
LRM
119
0
0
19 May 2025
DeepCritic: Deliberate Critique with Large Language Models
Wenkai Yang
Jingwen Chen
Yankai Lin
Ji-Rong Wen
ALM
LRM
96
1
0
01 May 2025
Neural Theorem Proving: Generating and Structuring Proofs for Formal Verification
Balaji Rao
William Eiers
Carlo Lipizzi
133
0
0
23 Apr 2025
Heimdall: test-time scaling on the generative verification
Wenlei Shi
Xing Jin
LRM
95
7
0
14 Apr 2025
Beyond Accuracy: The Role of Calibration in Self-Improving Large Language Models
Liangjie Huang
Dawei Li
Huan Liu
Lu Cheng
LRM
100
0
0
03 Apr 2025