arXiv:2311.03755
Multilingual Mathematical Autoformalization

7 November 2023
Albert Q. Jiang
Wenda Li
M. Jamnik
Abstract

Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence. Existing methods tend to circumvent this challenge by manually curating small corpora or using few-shot learning with large language models. But these methods suffer from data scarcity and formal language acquisition difficulty. In this work, we create MMA, a large, flexible, multilingual, and multi-domain dataset of informal-formal pairs, by using a language model to translate in the reverse direction, that is, from formal mathematical statements into corresponding informal ones. Experiments show that language models fine-tuned on MMA produce 16-18% of statements acceptable with minimal corrections on the miniF2F and ProofNet benchmarks, up from 0% with the base model. We demonstrate that fine-tuning on multilingual formal data results in more capable autoformalization models even when deployed on monolingual tasks.
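
As a rough illustration of the reverse-direction idea described in the abstract, the sketch below shows how (informal, formal) training pairs might be generated by asking a language model to informalize existing formal statements, with fine-tuning then performed in the forward (informal-to-formal) direction. The `call_llm` helper, the `informalize` and `build_pairs` functions, and the prompt wording are hypothetical stand-ins for illustration, not the paper's actual pipeline.

```python
# Minimal sketch, assuming a generic chat-completion client behind `call_llm`.
# Formal statements (e.g. Lean or Isabelle theorems) are informalized by an LLM,
# producing (informal, formal) pairs for fine-tuning an autoformalization model.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API call; wire in a real client here."""
    raise NotImplementedError("connect to your preferred LLM provider")

def informalize(formal_statement: str, language: str) -> str:
    # Reverse direction: formal -> informal (illustrative prompt wording).
    prompt = (
        f"Translate the following {language} theorem statement into a concise "
        f"natural-language description:\n\n{formal_statement}"
    )
    return call_llm(prompt)

def build_pairs(formal_statements: list[tuple[str, str]]) -> list[dict]:
    # Each element is (formal_language, statement), e.g. ("Lean", "theorem ...").
    pairs = []
    for language, statement in formal_statements:
        informal = informalize(statement, language)
        # Fine-tuning data runs in the forward direction: informal -> formal.
        pairs.append({"input": informal, "target": statement, "language": language})
    return pairs
```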
