ResearchTrend.AI
Demystifying Long Chain-of-Thought Reasoning in LLMs


5 February 2025
Edward Yeo
Yuxuan Tong
Morry Niu
Graham Neubig
Xiang Yue
    OffRL
    LRM

Papers citing "Demystifying Long Chain-of-Thought Reasoning in LLMs"

50 / 83 papers shown
AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models
Feng Luo
Yu-Neng Chuang
Guanchu Wang
Hoang Anh Duy Le
Shaochen Zhong
...
Jiayi Yuan
Yang Sui
Vladimir Braverman
Vipin Chaudhary
Xia Hu
LRM
33
1
0
28 May 2025
Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Sohyun An
Ruochen Wang
Tianyi Zhou
Cho-Jui Hsieh
KELM
LRM
33
0
0
27 May 2025
Beyond Templates: Dynamic Adaptation of Reasoning Demonstrations via Feasibility-Aware Exploration
Yong Wu
Weihang Pan
Ke Li
Chen Binhui
Ping Li
Binbin Lin
LRM
20
0
0
27 May 2025
Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning
Yang He
Xiao Ding
Bibo Cai
Yufei Zhang
Kai Xiong
Zhouhao Sun
Bing Qin
Ting Liu
LRM
15
0
0
27 May 2025
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
Zhengliang Shi
Lingyong Yan
Dawei Yin
Suzan Verberne
Maarten de Rijke
Zhaochun Ren
LRM
35
0
0
26 May 2025
REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Jun Rao
Min Zhang
OffRL
LRM
24
0
0
26 May 2025
Adaptive Deep Reasoning: Triggering Deep Thinking When Needed
Yunhao Wang
Yuhao Zhang
T. Yu
Can Xu
Feng Zhang
Fengzong Lian
OffRL
LRM
12
0
0
26 May 2025
Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers
Rihui Xin
Han Liu
Zecheng Wang
Yupeng Zhang
Dianbo Sui
Xiaolin Hu
Bingning Wang
SyDa
16
1
0
26 May 2025
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
Weize Chen
Jiarui Yuan
Tailin Jin
Ning Ding
Huimin Chen
Zhiyuan Liu
Maosong Sun
OffRL
MQ
26
0
0
25 May 2025
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation
Jiwan Chung
Junhyeok Kim
Siyeol Kim
Jaeyoung Lee
Min Soo Kim
Youngjae Yu
LRM
39
0
0
24 May 2025
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan S Kim
Jeongsol Kim
Jong Chul Ye
45
0
0
24 May 2025
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting
Shijue Huang
Hongru Wang
Wanjun Zhong
Zhaochen Su
Jiazhan Feng
Bowen Cao
Yi R. Fung
OffRL
LRM
79
0
0
24 May 2025
Not All Tokens Are What You Need In Thinking
Hang Yuan
Bin Yu
Haotian Li
Shijun Yang
Christina Dan Wang
Zhou Yu
X. Xu
Weizhen Qi
Kai Chen
LRM
45
0
0
23 May 2025
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning
Yutong Chen
Jiandong Gao
Ji Wu
ALM
115
0
0
23 May 2025
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Zigeng Chen
Xinyin Ma
Gongfan Fang
Ruonan Yu
Xinchao Wang
LRM
109
0
0
23 May 2025
Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning
Cehao Yang
Xueyuan Lin
Chengjin Xu
Xuhui Jiang
Xiaojun Wu
Honghao Liu
Hui Xiong
Jian Guo
LRM
44
0
0
22 May 2025
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
Razvan-Gabriel Dumitru
Darius Peteleaza
Vikas Yadav
Liangming Pan
ReLM
LRM
30
0
0
22 May 2025
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities
Jinyang Wu
Chonghua Liao
Mingkuan Feng
Shuai Zhang
Zhengqi Wen
Pengpeng Shao
Huazhe Xu
Jianhua Tao
LRM
OffRL
53
0
0
21 May 2025
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
Wei Liu
Ruochen Zhou
Yiyun Deng
Yuzhen Huang
Junteng Liu
Yuntian Deng
Yizhe Zhang
Junxian He
OffRL
LRM
14
0
0
21 May 2025
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners
Weixiang Zhao
Jiahe Guo
Yang Deng
Tongtong Wu
Wenxuan Zhang
...
Yanyan Zhao
Wanxiang Che
Bing Qin
Tat-Seng Chua
Ting Liu
LRM
72
0
0
21 May 2025
Self-Evolving Curriculum for LLM Reasoning
Xiaoyin Chen
Jiarui Lu
Minsu Kim
Dinghuai Zhang
Jian Tang
Alexandre Piché
Nicolas Angelard-Gontier
Yoshua Bengio
Ehsan Kamalloo
ReLM
LRM
75
0
0
20 May 2025
FlashThink: An Early Exit Method For Efficient Reasoning
Guochao Jiang
Guofeng Quan
Zepeng Ding
Ziqin Luo
Dixuan Wang
Zheng Hu
ReLM
LRM
37
1
0
20 May 2025
SHARP: Synthesizing High-quality Aligned Reasoning Problems for Large Reasoning Models Reinforcement Learning
Xiong Jun Wu
Zhenduo Zhang
ZuJie Wen
Zhiqiang Zhang
Wang Ren
...
Xudong Han
Chengfu Tang
Dingnan Jin
Qing Cui
Jun Zhou
LRM
91
0
0
20 May 2025
Let LLMs Break Free from Overthinking via Self-Braking Tuning
Haoran Zhao
Yuchen Yan
Yongliang Shen
Haolei Xu
Wenqi Zhang
Kaitao Song
Jian Shao
Weiming Lu
Jun Xiao
Yueting Zhuang
LRM
46
0
0
20 May 2025
Reinforcement Learning vs. Distillation: Understanding Accuracy and Capability in LLM Reasoning
Minwu Kim
Anubhav Shrestha
Safal Shrestha
Aadim Nepal
Keith Ross
31
0
0
20 May 2025
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
Penghui Qi
Zichen Liu
Tianyu Pang
Chao Du
W. Lee
Min Lin
OffRL
LRM
35
0
0
19 May 2025
ToTRL: Unlock LLM Tree-of-Thoughts Reasoning Potential through Puzzles Solving
Haoyuan Wu
Xueyi Chen
Rui Ming
Jilong Gao
Shoubo Hu
Zhuolun He
Bei Yu
LRM
60
0
0
19 May 2025
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
Hengli Li
Chenxi Li
Tong Wu
Xuekai Zhu
Yuxuan Wang
...
Eric Hanchen Jiang
Song-Chun Zhu
Zixia Jia
Ying Nian Wu
Zilong Zheng
LRM
70
0
0
19 May 2025
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
Xiaoyuan Liu
Tian Liang
Zhiwei He
Jiahao Xu
Wenxuan Wang
Pinjia He
Zhaopeng Tu
Haitao Mi
Dong Yu
OffRL
ReLM
LRM
64
0
0
19 May 2025
Efficient RL Training for Reasoning Models via Length-Aware Optimization
Danlong Yuan
Tian Xie
Shaohan Huang
Zhuocheng Gong
Huishuai Zhang
Chong Luo
Furu Wei
Dongyan Zhao
OffRL
LRM
VLM
46
1
0
18 May 2025
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?
Maoyuan Ye
Jing Zhang
Juhua Liu
Bo Du
Dacheng Tao
LRM
89
0
0
18 May 2025
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
Hao Mark Chen
Guanxi Lu
Yasuyuki Okoshi
Zhiwen Mo
Masato Motomura
Hongxiang Fan
LRM
71
0
0
16 May 2025
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
Chengyu Huang
Zhengxin Zhang
Claire Cardie
LRM
69
0
0
16 May 2025
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
Zheng Li
Qingxiu Dong
Jingyuan Ma
Di Zhang
Zhifang Sui
LRM
42
0
0
16 May 2025
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think
Seongyun Lee
Seungone Kim
Minju Seo
Yongrae Jo
Dongyoung Go
...
Xiang Yue
Sean Welleck
Graham Neubig
Moontae Lee
Minjoon Seo
LRM
62
1
0
15 May 2025
Practical Reasoning Interruption Attacks on Reasoning Large Language Models
Yu Cui
Cong Zuo
SILM
AAML
LRM
53
0
0
10 May 2025
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models
Bin Yu
Hang Yuan
Haotian Li
X. Xu
Yuliang Wei
Bailing Wang
Weizhen Qi
Kai Chen
LRM
57
2
0
06 May 2025
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
Qianchu Liu
Sheng Zhang
Guanghui Qin
Timothy Ossowski
Yu Gu
...
Sam Preston
Mu-Hsin Wei
Paul Vozila
Tristan Naumann
Hoifung Poon
OOD
LRM
VLM
75
6
0
06 May 2025
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
Zhiting Fan
Ruizhe Chen
Zuozhu Liu
76
0
0
30 Apr 2025
Phi-4-reasoning Technical Report
Marah Abdin
Sahaj Agarwal
Ahmed Hassan Awadallah
Vidhisha Balachandran
Harkirat Singh Behl
...
Vaishnavi Shrivastava
Vibhav Vineet
Yue Wu
Safoora Yousefi
Guoqing Zheng
ReLM
LRM
138
9
0
30 Apr 2025
Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
Jinyan Su
Jennifer Healey
Preslav Nakov
Claire Cardie
LRM
240
9
0
30 Apr 2025
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think
Hasan Hammoud
Hani Itani
Guohao Li
ReLM
LRM
92
2
0
29 Apr 2025
Fast-Slow Thinking for Large Vision-Language Model Reasoning
W. L. Xiao
Leilei Gan
Weilong Dai
Wanggui He
Ziwei Huang
...
Fangxun Shu
Zhelun Yu
Peng Zhang
Hao Jiang
Leilei Gan
ReLM
LRM
AI4CE
375
7
0
25 Apr 2025
Dynamic Early Exit in Reasoning Models
Chenxu Yang
Qingyi Si
Yongjie Duan
Zheliang Zhu
Chenyu Zhu
Zheng Lin
Zheng Lin
Li Cao
Weiping Wang
ReLM
LRM
98
14
0
22 Apr 2025
SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM
Xinyu Zhang
Jiadong Wang
Zifei Cheng
Wenhao Zhuang
Zheng Lin
...
Shouyu Yin
Chaohang Wen
Haotian Zhang
Bin Chen
Bing Yu
LRM
113
7
0
19 Apr 2025
LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models
Kang He
Kaushik Roy
LRM
43
0
0
18 Apr 2025
CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
Feiyang Li
Peng Fang
Zhan Shi
Arijit Khan
Fang Wang
Dan Feng
Weihao Wang
Xin Zhang
Yongjian Cui
ReLM
LRM
73
1
0
18 Apr 2025
Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
Syeda Nahida Akter
Shrimai Prabhumoye
Matvei Novikov
Seungju Han
Ying Lin
...
Eric Nyberg
Yejin Choi
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ReLM
OffRL
LRM
371
2
1
15 Apr 2025
S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
Wenyuan Zhang
Jiawei Sheng
Xinghua Zhang
Zefeng Zhang
Tingwen Liu
ELM
LRM
69
4
0
14 Apr 2025
MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning
Zhaopeng Feng
Shaosheng Cao
Jiahan Ren
Jiayuan Su
Ruizhe Chen
Yan Zhang
Zhe Xu
Yao Hu
Jian Wu
Zuozhu Liu
ALM
LRM
85
9
0
14 Apr 2025