Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

While reasoning-augmented large language models (RLLMs) significantly enhance complex task performance through extended reasoning chains, they inevitably introduce substantial unnecessary token consumption, particularly for simpler problems where Short Chain-of-Thought (Short CoT) suffices. This overthinking phenomenon leads to inefficient resource usage without proportional accuracy gains. To address this issue, we propose Self-Route, a dynamic reasoning framework that automatically selects between general and reasoning modes based on model capability estimation. Our approach introduces a lightweight pre-inference stage that extracts capability-aware embeddings from hidden-layer representations, enabling real-time evaluation of the model's ability to solve a given problem. We further construct Gradient-10K, a dataset built on model-based difficulty estimation with dense complexity sampling, to train the router for precise capability boundary detection. Extensive experiments demonstrate that Self-Route achieves accuracy comparable to reasoning models while reducing token consumption by 30–55% across diverse benchmarks. The proposed framework is consistently effective across models with different parameter scales and reasoning paradigms, highlighting its general applicability and practical value.
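The routing idea described above can be sketched minimally: pool hidden-layer representations into a capability-aware embedding, then apply a lightweight probe that decides whether the model should answer in general (Short CoT) mode or fall back to the full reasoning mode. The pooling operation, the sigmoid probe, and all parameter names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

HIDDEN_DIM = 64  # stand-in for the backbone model's hidden size

def capability_embedding(hidden_states: np.ndarray) -> np.ndarray:
    """Pool per-token hidden states from the lightweight pre-inference
    stage into one capability vector (mean pooling is an assumption)."""
    return hidden_states.mean(axis=0)

def route(embedding: np.ndarray, w: np.ndarray, b: float,
          threshold: float = 0.5) -> str:
    """Return 'general' if the probe estimates the problem is within the
    model's Short-CoT capability, else 'reasoning' (hypothetical probe)."""
    p_solvable = 1.0 / (1.0 + np.exp(-(embedding @ w + b)))  # sigmoid score
    return "general" if p_solvable >= threshold else "reasoning"

# Toy usage with random stand-in values (a real router would use probe
# weights trained on a difficulty-graded dataset such as Gradient-10K).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(10, HIDDEN_DIM))  # 10 tokens of hidden states
w = rng.normal(size=HIDDEN_DIM)
mode = route(capability_embedding(hidden), w, b=0.0)
print(mode)
```

In this sketch only the small probe runs before generation, so the routing overhead is a single pooled forward pass plus a dot product, which is what makes mode selection cheap relative to a full reasoning chain.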
@article{he2025_2505.20664,
  title={Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning},
  author={Yang He and Xiao Ding and Bibo Cai and Yufei Zhang and Kai Xiong and Zhouhao Sun and Bing Qin and Ting Liu},
  journal={arXiv preprint arXiv:2505.20664},
  year={2025}
}