2505.22617
Cited By
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
28 May 2025
Ganqu Cui
Yuchen Zhang
Jiacheng Chen
Lifan Yuan
Zhi Wang
Yuxin Zuo
Haozhan Li
Yuchen Fan
Huayu Chen
Weize Chen
Zhiyuan Liu
Hao Peng
Lei Bai
Wanli Ouyang
Yu Cheng
Bowen Zhou
Ning Ding
LRM
Papers citing "The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models"
10 / 10 papers shown
Title
No Free Lunch: Rethinking Internal Feedback for LLM Reasoning
Yanzhi Zhang
Zhaoxi Zhang
Haoxiang Guan
Yilin Cheng
Yitong Duan
Chen Wang
Yue Wang
Shuxin Zheng
Jiyan He
ReLM
LRM
48
0
0
20 Jun 2025
Reasoning with Exploration: An Entropy Perspective
Daixuan Cheng
Shaohan Huang
Xuekai Zhu
Bo Dai
Wayne Xin Zhao
Zhenliang Zhang
Furu Wei
LRM
32
0
0
17 Jun 2025
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Shivam Agarwal
Zimin Zhang
Lifan Yuan
Jiawei Han
Hao Peng
164
8
0
21 May 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Li Sheng
Xuekai Zhu
...
Youbang Sun
Zhiyuan Ma
Lifan Yuan
Ning Ding
Bowen Zhou
OffRL
419
31
0
22 Apr 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
175
17
0
21 Apr 2025
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue
Zhiqi Chen
Rui Lu
Andrew Zhao
Zhaokai Wang
Yang Yue
Shiji Song
Gao Huang
ReLM
LRM
233
128
0
18 Apr 2025
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang
Haitao Wu
Changqing Zhang
Peilin Zhao
Yatao Bian
ReLM
LRM
186
19
0
08 Apr 2025
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu
Changyu Chen
Wenjun Li
Penghui Qi
Tianyu Pang
Chao Du
Wee Sun Lee
Min Lin
OffRL
LRM
228
172
0
26 Mar 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zheng Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
245
217
0
18 Mar 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zihao Huang
Ziyao Xu
Zhiyong Yang
Zonghan Yang
Zongyu Lin
OffRL
ALM
AI4TS
VLM
LRM
351
338
0
22 Jan 2025