Meta-aware Learning in text-to-SQL Large Language Model

25 May 2025

Abstract

Advances in large language models (LLMs) have created new opportunities for text-to-SQL tasks, where the main challenges are understanding complex domain information and complex database structures in business applications. In this paper, we propose a meta-aware learning framework that integrates domain knowledge, database schema, chain-of-thought reasoning processes, and metadata relationships to improve SQL generation quality. The proposed framework comprises four learning strategies: schema-based learning, chain-of-thought (CoT) learning, knowledge-enhanced learning, and key information tokenization. Through fine-tuning, this approach gives the LLM a comprehensive understanding of database structure and metadata information, improving its SQL generation performance within business domains. Through two experimental studies, we demonstrate the superiority of the proposed methods in execution accuracy, multi-task SQL generation capability, and reduction of catastrophic forgetting.
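As a rough illustration only, the sketch below shows one way a single fine-tuning record could combine the four kinds of signal the abstract names: schema, domain knowledge, chain-of-thought steps, and marked key information. The abstract does not specify the paper's data format, so the marker tokens (`<tbl>`, `<col>`), the class and function names, and the prompt layout are all assumptions made for this example, not the author's method.

```python
from dataclasses import dataclass

# Hypothetical marker tokens for key information (table and column names).
# The abstract does not give the actual tokens; these are assumptions.
TABLE_OPEN, TABLE_CLOSE = "<tbl>", "</tbl>"
COL_OPEN, COL_CLOSE = "<col>", "</col>"


@dataclass
class Table:
    name: str
    columns: list[str]


@dataclass
class TrainingExample:
    question: str               # natural-language request
    tables: list[Table]         # schema relevant to the question
    knowledge: list[str]        # domain notes, e.g. business definitions
    reasoning_steps: list[str]  # chain-of-thought leading to the query
    sql: str                    # target SQL


def render_schema(tables: list[Table]) -> str:
    """Serialize the schema, wrapping table and column names in marker tokens."""
    lines = []
    for table in tables:
        cols = ", ".join(f"{COL_OPEN}{c}{COL_CLOSE}" for c in table.columns)
        lines.append(f"{TABLE_OPEN}{table.name}{TABLE_CLOSE}({cols})")
    return "\n".join(lines)


def build_record(ex: TrainingExample) -> dict[str, str]:
    """Assemble one prompt/completion pair for supervised fine-tuning."""
    prompt = "\n".join([
        "### Schema",
        render_schema(ex.tables),
        "### Domain knowledge",
        *[f"- {note}" for note in ex.knowledge],
        "### Question",
        ex.question,
    ])
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(ex.reasoning_steps, start=1)
    )
    completion = f"### Reasoning\n{steps}\n### SQL\n{ex.sql}"
    return {"prompt": prompt, "completion": completion}
```

Placing the reasoning steps in the completion rather than the prompt means the model is trained to produce its reasoning before the SQL. That is a design choice of this sketch and not something the abstract confirms.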


@article{zhang2025_2505.18929,
  title={Meta-aware Learning in text-to-SQL Large Language Model},
  author={Wenda Zhang},
  journal={arXiv preprint arXiv:2505.18929},
  year={2025}
}
