Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.01296
Cited By
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning
2 April 2025
Bairu Hou
Yang Zhang
Jiabao Ji
Yujian Liu
Kaizhi Qian
Jacob Andreas
Shiyu Chang
OffRL
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning"
32 / 32 papers shown
Title
How does Transformer Learn Implicit Reasoning?
Jiaran Ye
Zijun Yao
Zhidian Huang
Liangming Pan
Jinxin Liu
...
Amy Xin
Liu Weichuan
Xiaoyin Che
Lei Hou
Juanzi Li
OffRL
ReLM
LRM
76
0
0
29 May 2025
PixelThink: Towards Efficient Chain-of-Pixel Reasoning
Song Wang
Gongfan Fang
Lingdong Kong
Xiangtai Li
Jianyun Xu
Sheng Yang
Qiang Li
Jianke Zhu
Xinchao Wang
LRM
108
0
0
29 May 2025
AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models
Feng Luo
Yu-Neng Chuang
Guanchu Wang
Hoang Anh Duy Le
Shaochen Zhong
...
Jiayi Yuan
Yang Sui
Vladimir Braverman
Vipin Chaudhary
Helen Zhou
LRM
73
1
0
28 May 2025
Beyond Templates: Dynamic Adaptation of Reasoning Demonstrations via Feasibility-Aware Exploration
Yong Wu
Weihang Pan
Ke Li
Chen Binhui
Ping Li
Binbin Lin
LRM
59
0
0
27 May 2025
Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Sohyun An
Ruochen Wang
Tianyi Zhou
Cho-Jui Hsieh
KELM
LRM
84
1
0
27 May 2025
Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners
Jiabao Ji
Yongchao Chen
Yang Zhang
Ramana Rao Kompella
Chuchu Fan
Gaowen Liu
Shiyu Chang
97
0
0
26 May 2025
ARM: Adaptive Reasoning Model
Siye Wu
Jian Xie
Yikai Zhang
Aili Chen
Kai Zhang
Yu Su
Yanghua Xiao
LRM
74
0
0
26 May 2025
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting
Shijue Huang
Hongru Wang
Wanjun Zhong
Zhaochen Su
Jiazhan Feng
Bowen Cao
Yi R. Fung
OffRL
LRM
167
2
0
24 May 2025
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Zigeng Chen
Xinyin Ma
Gongfan Fang
Ruonan Yu
Xinchao Wang
LRM
161
1
0
23 May 2025
Activation-Guided Consensus Merging for Large Language Models
Yuxuan Yao
Shuqi Liu
Zehua Liu
Qintong Li
Mingyang Liu
Xiongwei Han
Zhijiang Guo
Han Wu
Linqi Song
MoMe
139
0
0
20 May 2025
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
Chengyu Huang
Zhengxin Zhang
Claire Cardie
LRM
116
0
0
16 May 2025
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Songjun Tu
Jiahao Lin
Qichao Zhang
Xiangyu Tian
Linjing Li
Xiangyuan Lan
Dongbin Zhao
OffRL
ReLM
LRM
95
3
0
16 May 2025
Making Small Language Models Efficient Reasoners: Intervention, Supervision, Reinforcement
Xuechen Zhang
Zijian Huang
Chenshun Ni
Ziyang Xiong
Jiasi Chen
Samet Oymak
ReLM
LRM
136
3
0
12 May 2025
Efficient Reasoning for LLMs through Speculative Chain-of-Thought
Jikai Wang
Junlin Li
Jianye Hou
Hao Fei
Lijun Wu
Min Zhang
LLMAG
LRM
128
5
0
27 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
386
14
0
15 Apr 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
180
137
0
24 Mar 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Yang Sui
Yu-Neng Chuang
Guanchu Wang
Jiamu Zhang
Tianyi Zhang
...
Hongyi Liu
Andrew Wen
Shaochen
Zhong
Hanjie Chen
OffRL
ReLM
LRM
198
100
0
20 Mar 2025
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Kanishk Gandhi
Ayush Chakravarthy
Anikait Singh
Nathan Lile
Noah D. Goodman
ReLM
LRM
155
111
0
03 Mar 2025
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Zhenyi Shen
Hanqi Yan
Linhai Zhang
Zhanghao Hu
Yali Du
Yulan He
LRM
159
27
0
28 Feb 2025
LIMO: Less is More for Reasoning
Yixin Ye
Zhen Huang
Yang Xiao
Ethan Chern
Shijie Xia
Pengfei Liu
LRM
164
167
0
05 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar
Jaechul Roh
A. Naseh
Marzena Karpinska
Mohit Iyyer
Amir Houmansadr
Eugene Bagdasarian
LRM
135
25
0
04 Feb 2025
Process Reinforcement through Implicit Rewards
Ganqu Cui
Lifan Yuan
Ziyi Wang
Hanbin Wang
Wendi Li
...
Yu Cheng
Zhiyuan Liu
Maosong Sun
Bowen Zhou
Ning Ding
OffRL
LRM
157
103
0
03 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
380
2,000
0
22 Jan 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zihao Huang
Ziyao Xu
Zhiyong Yang
Zonghan Yang
Zongyu Lin
OffRL
ALM
AI4TS
VLM
LRM
287
338
0
22 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zizhuo Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
ReLM
185
197
0
30 Dec 2024
HybridFlow: A Flexible and Efficient RLHF Framework
Guangming Sheng
Chi Zhang
Zilingfeng Ye
Xibin Wu
Wang Zhang
Ru Zhang
Size Zheng
Haibin Lin
Chuan Wu
AI4CE
186
241
0
28 Sep 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
125
151
0
14 Mar 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELM
AIMat
124
282
0
21 Feb 2024
V-STaR: Training Verifiers for Self-Taught Reasoners
Arian Hosseini
Xingdi Yuan
Nikolay Malkin
Rameswar Panda
Alessandro Sordoni
Rishabh Agarwal
ReLM
LRM
93
137
0
09 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
163
1,287
0
05 Feb 2024
Let's Verify Step by Step
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM
OffRL
LRM
198
1,240
0
31 May 2023
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,709
0
08 Jun 2015
1