ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2502.02577
40
0

A comparison of translation performance between DeepL and Supertext

4 February 2025
Alex Flückiger
Chantal Amrhein
Tim Graf
Frédéric Odermatt
Martin Pömsl
Philippe Schläpfer
Florian Schottmann
Samuel Laubli
    ELM
ArXivPDFHTML
Abstract

As strong machine translation (MT) systems are increasingly based on large language models (LLMs), reliable quality benchmarking requires methods that capture their ability to leverage extended context. This study compares two commercial MT systems -- DeepL and Supertext -- by assessing their performance on unsegmented texts. We evaluate translation quality across four language directions with professional translators assessing segments with full document-level context. While segment-level assessments indicate no strong preference between the systems in most cases, document-level analysis reveals a preference for Supertext in three out of four language directions, suggesting superior consistency across longer texts. We advocate for more context-sensitive evaluation methodologies to ensure that MT quality assessments reflect real-world usability. We release all evaluation data and scripts for further analysis and reproduction atthis https URL.

View on arXiv
@article{flückiger2025_2502.02577,
  title={ A comparison of translation performance between DeepL and Supertext },
  author={ Alex Flückiger and Chantal Amrhein and Tim Graf and Frédéric Odermatt and Martin Pömsl and Philippe Schläpfer and Florian Schottmann and Samuel Läubli },
  journal={arXiv preprint arXiv:2502.02577},
  year={ 2025 }
}
Comments on this paper