ResearchTrend.AI

TULUN: Transparent and Adaptable Low-resource Machine Translation

24 May 2025
Raphaël Merx
Hanna Suominen
Lois Hong
Nick Thieberger
Trevor Cohn
Ekaterina Vylomova
arXiv (abs) · PDF · HTML
Main: 6 pages · 3 figures · 4 tables · Bibliography: 3 pages · Appendix: 1 page
Abstract

Machine translation (MT) systems that support low-resource languages often struggle on specialized domains. While researchers have proposed various techniques for domain adaptation, these approaches typically require model fine-tuning, making them impractical for non-technical users and small organizations. To address this gap, we propose Tulun, a versatile solution for terminology-aware translation, combining neural MT with large language model (LLM)-based post-editing guided by existing glossaries and translation memories. Our open-source web-based platform enables users to easily create, edit, and leverage terminology resources, fostering a collaborative human-machine translation process that respects and incorporates domain expertise while increasing MT accuracy. Evaluations show effectiveness in both real-world and benchmark scenarios: on medical and disaster relief translation tasks for Tetun and Bislama, our system achieves improvements of 16.90-22.41 ChrF++ points over baseline MT systems. Across six low-resource languages on the FLORES dataset, Tulun outperforms both standalone MT and LLM approaches, achieving an average improvement of 2.8 ChrF points over NLLB-54B.
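The core idea above — pairing an initial MT draft with matched glossary entries and asking an LLM to revise it — can be sketched as follows. This is a minimal illustration, not Tulun's actual code: the function names (`match_glossary`, `build_postedit_prompt`), the prompt wording, and the toy Tetun glossary entry are all assumptions for demonstration.

```python
def match_glossary(source_text: str, glossary: dict[str, str]) -> dict[str, str]:
    """Return glossary entries whose source term occurs in the input sentence."""
    lowered = source_text.lower()
    return {src: tgt for src, tgt in glossary.items() if src.lower() in lowered}

def build_postedit_prompt(source: str, draft: str, glossary: dict[str, str]) -> str:
    """Assemble an LLM prompt asking for a terminology-consistent revision of an MT draft."""
    terms = match_glossary(source, glossary)
    term_lines = "\n".join(f"- {src} -> {tgt}" for src, tgt in terms.items())
    return (
        "Revise the draft translation so it uses the required terminology.\n"
        f"Source: {source}\n"
        f"Draft translation: {draft}\n"
        f"Required terms:\n{term_lines}\n"
        "Revised translation:"
    )

# Toy example: a single medical glossary entry (hypothetical Tetun term).
glossary = {"measles": "sarampu"}
prompt = build_postedit_prompt(
    "The child has measles.", "Labarik ne'e moras.", glossary
)
```

The resulting prompt would then be sent to an LLM, whose output replaces the draft; the glossary itself stays human-editable, which is what keeps domain expertise in the loop without any model fine-tuning.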

@article{merx2025_2505.18683,
  title={TULUN: Transparent and Adaptable Low-resource Machine Translation},
  author={Raphaël Merx and Hanna Suominen and Lois Hong and Nick Thieberger and Trevor Cohn and Ekaterina Vylomova},
  journal={arXiv preprint arXiv:2505.18683},
  year={2025}
}