Mathesis: Towards Formal Theorem Proving from Natural Languages

Recent advances in large language models show strong promise for formal reasoning. However, most LLM-based theorem provers have been constrained by the need for expert-written formal statements as inputs, limiting their applicability to real-world problems expressed in natural language. We tackle this gap with Mathesis, the first end-to-end theorem proving pipeline that processes informal problem statements. It contributes Mathesis-Autoformalizer, the first autoformalizer to use reinforcement learning to enhance the formalization of natural-language problems, aided by our novel LeanScorer framework for nuanced assessment of formalization quality. It also proposes Mathesis-Prover, which generates formal proofs from the formalized statements. To evaluate the real-world applicability of end-to-end formal theorem proving, we introduce Gaokao-Formal, a benchmark of 488 complex problems from China's national college entrance exam. Our approach is carefully designed, with a thorough study of each component. Experiments demonstrate Mathesis's effectiveness: the autoformalizer outperforms the best baseline by 22% in pass-rate on Gaokao-Formal, and the full system surpasses other model combinations, achieving 64% accuracy on MiniF2F with pass@32 and a state-of-the-art 18% on Gaokao-Formal.
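The abstract reports results in terms of pass-rate and pass@32. The abstract does not state which estimator is used, but pass@k results in this literature are commonly computed with the unbiased estimator of Chen et al. (2021): given n sampled attempts of which c are correct, the estimated probability that at least one of k samples succeeds is 1 - C(n-c, k)/C(n, k). A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total number of sampled proof attempts per problem
    c: number of attempts that succeed (e.g., verified by Lean)
    k: budget of samples considered
    Returns the probability that at least one of k samples,
    drawn without replacement from the n attempts, is correct.
    """
    if n - c < k:
        # Fewer incorrect attempts than the budget: success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 4 attempts, 2 correct, a single draw succeeds half the time.
print(pass_at_k(4, 2, 1))   # → 0.5
# With any success among 32 attempts, pass@32 is 1.0.
print(pass_at_k(32, 1, 32)) # → 1.0
```

A benchmark-level pass@k score is then the mean of this per-problem estimate over all problems (e.g., the 488 problems in Gaokao-Formal).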
@article{xuejun2025_2506.07047,
  title={Mathesis: Towards Formal Theorem Proving from Natural Languages},
  author={Yu Xuejun and Jianyuan Zhong and Zijin Feng and Pengyi Zhai and Roozbeh Yousefzadeh and Wei Chong Ng and Haoxiong Liu and Ziyi Shou and Jing Xiong and Yudong Zhou and Claudia Beth Ong and Austen Jeremy Sugiarto and Yaoxi Zhang and Wai Ming Tai and Huan Cao and Dongcai Lu and Jiacheng Sun and Qiang Xu and Shen Xin and Zhenguo Li},
  journal={arXiv preprint arXiv:2506.07047},
  year={2025}
}