Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Abstract
Large Language Models (LLMs) struggle with complex reasoning due to limited diversity and inefficient search. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide generation. It combines (1) embedding perturbation for controlled exploration and (2) Bayesian optimisation to refine embeddings via a verifier-guided objective, balancing exploration and exploitation. This approach improves reasoning accuracy and coherence while avoiding reliance on heuristic search. Experiments demonstrate superior correctness with minimal computation, making it a scalable, model-agnostic solution.
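The two components described above — (1) Gaussian perturbation of a first-token embedding for exploration and (2) Bayesian optimisation against a verifier score — can be sketched in a toy setting. Everything below is an illustrative assumption, not the paper's implementation: the embedding dimension, the quadratic `verifier_score` stand-in, and the from-scratch GP with expected improvement are placeholders for the real LLM, verifier, and optimiser.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # toy embedding dimension (assumption; a real model uses the LLM's hidden size)

def verifier_score(e):
    # Toy stand-in for the verifier-guided objective: higher is better.
    # In the paper this would score generations decoded from the embedding.
    target = np.array([0.5, -0.3, 0.8, 0.1])  # unknown "good" embedding (assumed)
    return -np.sum((e - target) ** 2)

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel between two sets of embeddings.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # Gaussian-process posterior mean and std at query points Xq.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xq, X)
    mu = Ks @ np.linalg.solve(K, y)
    v = np.linalg.solve(K, Ks.T)
    var = np.clip(1.0 - np.sum(Ks * v.T, axis=1), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    # EI acquisition: balances exploitation (mu) and exploration (sigma).
    z = (mu - best) / sigma
    Phi = 0.5 * (1 + np.vectorize(math.erf)(z / math.sqrt(2)))
    phi = np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)
    return (mu - best) * Phi + sigma * phi

# (1) Controlled exploration: perturb a base first-token embedding.
base = np.zeros(DIM)
X = base + 0.5 * rng.standard_normal((8, DIM))
y = np.array([verifier_score(e) for e in X])

# (2) Bayesian optimisation: refine the embedding under the verifier objective.
for _ in range(15):
    cand = base + 1.0 * rng.standard_normal((256, DIM))
    mu, sigma = gp_posterior(X, y, cand)
    nxt = cand[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.vstack([X, nxt])
    y = np.append(y, verifier_score(nxt))

print(f"best verifier score: {y.max():.3f}")
```

The point of the sketch is the search loop's shape: candidate embeddings are scored only through the verifier, and the GP surrogate decides where to sample next, so no heuristic tree search over token sequences is needed.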
Citation:

@article{zhu2025_2505.24688,
  title={Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration},
  author={Qinglin Zhu and Runcong Zhao and Hanqi Yan and Yulan He and Yudong Chen and Lin Gui},
  journal={arXiv preprint arXiv:2505.24688},
  year={2025}
}