arXiv:2410.17728

Dialectal and Low Resource Machine Translation for Aromanian

23 October 2024
Alexandru-Iulius Jerpelea
Alina-Ştefania Rădoi
Sergiu Nisioi
Main: 10 pages; bibliography: 3 pages; appendix: 7 pages; 3 figures; 20 tables
Abstract

We present a neural machine translation system that translates between Romanian, English, and Aromanian (an endangered Eastern Romance language), the first of its kind. BLEU scores range from 17 to 32 depending on the direction and genre of the text. We also release the largest known Aromanian-Romanian bilingual corpus, consisting of 79k cleaned sentence pairs. We additionally present supporting tools: a language-agnostic sentence embedder (used for both text mining and automatic evaluation) and a diacritics converter. We publicly release our findings and models. Finally, we describe the deployment of our quantized model at https://arotranslate.com.
