Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

29 April 2021
Markus Freitag
George F. Foster
David Grangier
Viresh Ratnakar
Qijun Tan
Wolfgang Macherey

Papers citing "Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation"

50 / 96 papers shown
Title
MAATS: A Multi-Agent Automated Translation System Based on MQM Evaluation
Xi Wang
Jiaqian Hu
Safinah Ali
9
0
0
20 May 2025
Calibrating Translation Decoding with Quality Estimation on LLMs
Di Wu
Yibin Lei
Christof Monz
75
0
0
26 Apr 2025
Testing LLMs' Capabilities in Annotating Translations Based on an Error Typology Designed for LSP Translation: First Experiments with ChatGPT
Joachim Minder
Guillaume Wisniewski
Natalie Kübler
33
0
0
21 Apr 2025
Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset
Diana Galván-Sosa
Gabrielle Gaudeau
Pride Kavumba
Yunmeng Li
Hongyi Gu
Zheng Yuan
Keisuke Sakaguchi
P. Buttery
LRM
40
0
0
31 Mar 2025
Self-Vocabularizing Training for Neural Machine Translation
Pin-Jie Lin
Ernie Chang
Yangyang Shi
Vikas Chandra
71
0
0
18 Mar 2025
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
Xiang Geng
Zhejian Lai
Jiajun Chen
Hao Yang
Shujian Huang
62
0
0
27 Feb 2025
Enhancing Human Evaluation in Machine Translation with Comparative Judgment
Yixiao Song
Parker Riley
Daniel Deutsch
Markus Freitag
68
1
0
25 Feb 2025
Automatic Input Rewriting Improves Translation with Large Language Models
Dayeon Ki
Marine Carpuat
46
0
0
23 Feb 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
208
0
0
21 Feb 2025
Aligning Black-box Language Models with Human Judgments
Gerrit J. J. van den Burg
Gen Suzuki
Wei Liu
Murat Sensoy
ALM
82
0
0
07 Feb 2025
A comparison of translation performance between DeepL and Supertext
Alex Flückiger
Chantal Amrhein
Tim Graf
Frédéric Odermatt
Martin Pömsl
Philippe Schläpfer
Florian Schottmann
Samuel Laubli
ELM
45
0
0
04 Feb 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
Ran Zhang
Wei Zhao
Steffen Eger
79
4
0
24 Oct 2024
Impact of Model Size on Fine-tuned LLM Performance in Data-to-Text Generation: A State-of-the-Art Investigation
Joy Mahapatra
Utpal Garain
47
8
0
19 Jul 2024
AI-Assisted Human Evaluation of Machine Translation
Vilém Zouhar
Tom Kocmi
Mrinmaya Sachan
51
5
0
18 Jun 2024
Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
Boxuan Lyu
Hidetaka Kamigaito
Kotaro Funakoshi
Manabu Okumura
43
0
0
17 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
46
4
0
29 May 2024
What Have We Achieved on Non-autoregressive Translation?
Yafu Li
Huajian Zhang
Jianhao Yan
Yongjing Yin
Yue Zhang
42
1
0
21 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
119
23
0
20 May 2024
Natural Language Processing RELIES on Linguistics
Juri Opitz
Shira Wein
Nathan Schneider
AI4CE
60
7
0
09 May 2024
Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations
Dayeon Ki
Marine Carpuat
40
17
0
11 Apr 2024
Multi-Dimensional Machine Translation Evaluation: Model Evaluation and Resource for Korean
Dojun Park
Sebastian Padó
45
1
0
19 Mar 2024
Human Evaluation of English–Irish Transformer-Based NMT
Séamus Lankford
Haithem Afli
Andy Way
45
10
0
04 Mar 2024
Likelihood-based Mitigation of Evaluation Bias in Large Language Models
Masanari Ohi
Masahiro Kaneko
Ryuto Koike
Mengsay Loem
Naoaki Okazaki
45
4
0
25 Feb 2024
MT-Ranker: Reference-free machine translation evaluation by inter-system ranking
Ibraheem Muhammad Moosa
Rui Zhang
Wenpeng Yin
35
5
0
30 Jan 2024
Evaluating Optimal Reference Translations
Vilém Zouhar
Věra Kloudová
Martin Popel
Ondrej Bojar
39
2
0
28 Nov 2023
Physician Detection of Clinical Harm in Machine Translation: Quality Estimation Aids in Reliance and Backtranslation Identifies Critical Errors
Nikita Mehandru
Sweta Agrawal
Yimin Xiao
Elaine C. Khoong
Ge Gao
Marine Carpuat
Niloufar Salehi
32
10
0
25 Oct 2023
Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model
Christian Tomani
David Vilar
Markus Freitag
Colin Cherry
Subhajit Naskar
Mara Finkelstein
Xavier Garcia
Daniel Cremers
26
7
0
10 Oct 2023
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Jonas Belouadi
Anne Lauscher
Steffen Eger
25
28
0
30 Sep 2023
Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation
David Heineman
Yao Dou
Wei Xu
32
7
0
14 Aug 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
46
3
0
08 Aug 2023
Efficient Machine Translation Corpus Generation
K. Yuksel
Ahmet Gunduz
Shreyas Sharma
H. Sawaf
34
4
0
20 Jun 2023
BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation
T. Glushkova
Chrysoula Zerva
André F. T. Martins
43
6
0
30 May 2023
A Critical Evaluation of Evaluations for Long-form Question Answering
Fangyuan Xu
Yixiao Song
Mohit Iyyer
Eunsol Choi
ELM
39
97
0
29 May 2023
Leveraging GPT-4 for Automatic Translation Post-Editing
Vikas Raunak
Amr Sharaf
Yiren Wang
H. Awadallah
Arul Menezes
18
62
0
24 May 2023
Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents
Jannis Vamvas
Rico Sennrich
29
2
0
22 May 2023
PaLM 2 Technical Report
Rohan Anil
Andrew M. Dai
Orhan Firat
Melvin Johnson
Dmitry Lepikhin
...
Ce Zheng
Wei Zhou
Denny Zhou
Slav Petrov
Yonghui Wu
ReLM
LRM
128
1,152
0
17 May 2023
Angler: Helping Machine Translation Practitioners Prioritize Model Improvements
Samantha Robertson
Zijie J. Wang
Dominik Moritz
Mary Beth Kery
Fred Hohman
38
15
0
12 Apr 2023
Large language models effectively leverage document-level context for literary translation, but critical errors persist
Marzena Karpinska
Mohit Iyyer
38
82
0
06 Apr 2023
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
Qingyu Lu
Baopu Qiu
Liang Ding
Liping Xie
Tom Kocmi
Dacheng Tao
LRM
ALM
ELM
31
108
0
24 Mar 2023
Large Language Models Are State-of-the-Art Evaluators of Translation Quality
Tom Kocmi
C. Federmann
ELM
57
341
0
28 Feb 2023
Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors
Keqin Bao
Boyi Deng
Dayiheng Liu
Baosong Yang
Wenqiang Lei
Xiangnan He
Derek F. Wong
Jun Xie
42
4
0
17 Feb 2023
The unreasonable effectiveness of few-shot learning for machine translation
Xavier Garcia
Yamini Bansal
Colin Cherry
George F. Foster
M. Krikun
Fan Feng
Melvin Johnson
Orhan Firat
40
103
0
02 Feb 2023
BMX: Boosting Natural Language Generation Metrics with Explainability
Christoph Leiter
Hoang-Quan Nguyen
Steffen Eger
ELM
24
0
0
20 Dec 2022
Extrinsic Evaluation of Machine Translation Metrics
Nikita Moghe
Tom Sherborne
Mark Steedman
Alexandra Birch
ELM
31
18
0
20 Dec 2022
Toward Human-Like Evaluation for Natural Language Generation with Error Analysis
Qingyu Lu
Liang Ding
Liping Xie
Kanjian Zhang
Derek F. Wong
Dacheng Tao
ELM
ALM
36
14
0
20 Dec 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su
Weijia Shi
Jungo Kasai
Yizhong Wang
Yushi Hu
Mari Ostendorf
Wen-tau Yih
Noah A. Smith
Luke Zettlemoyer
Tao Yu
27
282
0
19 Dec 2022
Improving Simultaneous Machine Translation with Monolingual Data
Hexuan Deng
Liang Ding
Xuebo Liu
Meishan Zhang
Dacheng Tao
Min Zhang
40
12
0
02 Dec 2022
Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models
Vikas Raunak
Matt Post
Arul Menezes
EGVM
37
0
0
19 Nov 2022
Prompting PaLM for Translation: Assessing Strategies and Performance
David Vilar
Markus Freitag
Colin Cherry
Jiaming Luo
Viresh Ratnakar
George F. Foster
LRM
32
155
0
16 Nov 2022