SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition

While Large Language Models (LLMs) have achieved remarkable success in a wide range of applications, their performance often degrades in complex reasoning tasks. In this work, we introduce SELT (Self-Evaluation LLM Tree Search), a novel framework that leverages a modified Monte Carlo Tree Search (MCTS) to enhance LLM reasoning without relying on external reward models. By redefining the Upper Confidence Bound scoring to align with the intrinsic self-evaluation capabilities of LLMs, and by decomposing the inference process into atomic subtasks augmented with semantic clustering at each node, SELT effectively balances exploration and exploitation, reduces redundant reasoning paths, and mitigates hallucination. We validate our approach on challenging benchmarks, including the knowledge-based MMLU and the Tool Learning dataset Seal-Tools, where SELT achieves significant improvements in answer accuracy and reasoning robustness compared to baseline methods. Notably, our framework operates without task-specific fine-tuning, demonstrating strong generalizability across diverse reasoning tasks. Relevant results and code are available at this https URL.
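To make the core idea concrete, the sketch below illustrates one plausible reading of the abstract: a standard UCT-style selection rule in which the value term normally supplied by an external reward model is replaced by an averaged LLM self-evaluation score over atomic subtask nodes. The exact scoring formula, node structure, and names (Node, select, backpropagate, the exploration constant c) are illustrative assumptions, not the paper's specification.

import math

# Hypothetical sketch: MCTS selection driven by LLM self-evaluation scores
# rather than an external reward model. The precise UCB redefinition used by
# SELT is not given in the abstract; this uses a conventional UCT form with
# the exploitation term replaced by a mean self-evaluation score in [0, 1].

class Node:
    def __init__(self, subtask, parent=None):
        self.subtask = subtask          # atomic subtask handled at this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.self_eval_sum = 0.0        # accumulated self-evaluation scores

    def ucb(self, c=1.4):
        """Selection score: mean self-evaluation plus an exploration bonus."""
        if self.visits == 0:
            return float("inf")         # expand unvisited children first
        exploit = self.self_eval_sum / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def select(root):
    """Descend the tree, picking the highest-UCB child at each level."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.ucb())
    return node

def backpropagate(node, self_eval_score):
    """Propagate an LLM self-evaluation score back up to the root."""
    while node is not None:
        node.visits += 1
        node.self_eval_sum += self_eval_score
        node = node.parent

In this reading, semantic clustering of candidate continuations at each node would determine which children are created before select and backpropagate are applied; that step is omitted here.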
@article{wu2025_2506.07557,
  title   = {SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition},
  author  = {Mengsong Wu and Di Zhang and Yuqiang Li and Dongzhan Zhou and Wenliang Chen},
  journal = {arXiv preprint arXiv:2506.07557},
  year    = {2025}
}