2505.03469
Cited By
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models
6 May 2025
Bin Yu, Hang Yuan, Yuliang Wei, Bailing Wang, Weizhen Qi, Kai Chen
LRM
Papers citing "Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models"
1 / 1 papers shown
Title
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Songjun Tu, Jiahao Lin, Qichao Zhang, Xiangyu Tian, Linjing Li, Xiangyuan Lan, Dongbin Zhao
OffRL
ReLM
LRM
21
0
0
16 May 2025