
IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models

Tao Liu
Jiafan Lu
Bohan Yu
Pengcheng Wu
Liu Haixin
Guoyu Xu
Li Xiangheng
Lixiao Li
Jiaming Hou
Zhao Shijun
Xinglin Lyu
Kunli Zhang
Yuxiang Jia
Hongyin Zan
Main: 8 Pages
16 Figures
Bibliography: 3 Pages
8 Tables
Appendix: 14 Pages
Abstract

Text-to-SQL is a key natural language processing task that maps natural language questions to SQL queries, enabling intuitive interaction with web-based databases. Although current methods perform well on benchmarks such as BIRD and Spider, they struggle with complex reasoning, domain knowledge, and hypothetical queries, and remain costly to deploy in enterprise settings. To address these issues, we propose IESR (Information Enhanced Structured Reasoning), a framework for lightweight large language models that (i) leverages LLMs for key information understanding and schema linking while decoupling mathematical computation from SQL generation, (ii) integrates a multi-path reasoning mechanism based on Monte Carlo Tree Search (MCTS) with majority voting, and (iii) introduces a trajectory consistency verification module with a discriminator model to ensure accuracy and consistency. Experimental results demonstrate that IESR achieves state-of-the-art performance on the complex reasoning benchmark LogicCat (24.28 EX) and the Archer dataset (37.28 EX) using only compact models without fine-tuning. Furthermore, our analysis reveals that current coder models exhibit notable biases and deficiencies in physical knowledge, mathematical computation, and common-sense reasoning, highlighting important directions for future research. We released the code at this https URL.
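
The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that candidate SQL queries (for example, the endpoints of several reasoning paths) are compared by their execution results on a SQLite database, and the helper names `execute` and `majority_vote`, the `emp` table, and the candidate queries are all hypothetical.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query; return a hashable, order-insensitive result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so rows containing NULLs or mixed types still compare safely.
    return tuple(sorted(rows, key=repr))


def majority_vote(conn, candidates):
    """Return the first candidate whose execution result is the most common one."""
    results = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, res) for sql, res in results if res is not None]
    if not valid:
        return None
    counts = Counter(res for _, res in valid)
    winner, _ = counts.most_common(1)[0]
    return next(sql for sql, res in valid if res == winner)


# Toy database and hypothetical candidate queries for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [("a", 10), ("b", 20), ("c", 30)])

candidates = [
    "SELECT COUNT(*) FROM emp WHERE salary > 15",
    "SELECT COUNT(name) FROM emp WHERE salary >= 20",
    "SELECT COUNT(*) FROM emp",
    "SELECT COUNT(*) FROM employees",  # fails: table does not exist
]
chosen = majority_vote(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL strings lets syntactically different but equivalent queries support each other, and queries that fail to execute are simply excluded from the vote.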
