2505.10832
Cited By
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
16 May 2025
Songjun Tu
Jiahao Lin
Qichao Zhang
Xiangyu Tian
Linjing Li
Xiangyuan Lan
Dongbin Zhao
OffRL
ReLM
LRM
Papers citing "Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL"
Title
No papers