Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.06639
Cited By
Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification
9 March 2025
Youssef Mroueh
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification"
5 / 5 papers shown
Title
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
Gang Li
Ming Lin
Tomer Galanti
Zhengzhong Tu
Tianbao Yang
19
0
0
18 May 2025
VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation
Yiting Wang
Guoheng Sun
Wanghao Ye
Gang Qu
Ang Li
OffRL
3DV
LRM
VLM
17
0
0
17 May 2025
Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs
Yaorui Shi
Shihan Li
Chang Wu
Zhiyuan Liu
Sihang Li
Hengxing Cai
An Zhang
Xinbing Wang
ReLM
LRM
36
0
0
16 May 2025
MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance
Chen Hu
Timothy Neate
Shan Luo
Letizia Gionfrida
55
0
0
04 Apr 2025
Measurement of LLM's Philosophies of Human Nature
Minheng Ni
Ennan Wu
Zidong Gong
Zheng Yang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
Lijuan Wang
Wangmeng Zuo
37
0
0
03 Apr 2025
1