The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results

8 October 2021
M. Fomicheva
Piyawat Lertvittayakumjorn
Wei-Ye Zhao
Steffen Eger
Yang Gao
    ELM

Papers citing "The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results"

26 / 26 papers shown
AskQE: Question Answering as Automatic Evaluation for Machine Translation
Dayeon Ki
Kevin Duh
Marine Carpuat
24
0
0
15 Apr 2025
QE4PE: Word-level Quality Estimation for Human Post-Editing
Gabriele Sarti
Vilém Zouhar
Grzegorz Chrupała
Ana Guerberof Arenas
Malvina Nissim
Arianna Bisazza
41
0
0
04 Mar 2025
Explanation Regularisation through the Lens of Attributions
Pedro Ferreira
Wilker Aziz
Ivan Titov
43
1
0
23 Jul 2024
AI-Assisted Human Evaluation of Machine Translation
Vilém Zouhar
Tom Kocmi
Mrinmaya Sachan
48
5
0
18 Jun 2024
Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric
Golara Javadi
K. Yuksel
Yunsu Kim
Thiago Castro Ferreira
Mohamed Al-Badrashiny
19
2
0
20 Jan 2024
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Jiayi Wang
David Ifeoluwa Adelani
Sweta Agrawal
Marek Masiak
Ricardo Rei
...
V. Otiende
C. Mbonu
Sakayo Toadoum Sari
Yao Lu
Pontus Stenetorp
23
6
0
16 Nov 2023
The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics
Christoph Leiter
Juri Opitz
Daniel Deutsch
Yang Gao
Rotem Dror
Steffen Eger
ALM
LRM
ELM
37
31
0
30 Oct 2023
Towards Explainable Evaluation Metrics for Machine Translation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
ELM
28
13
0
22 Jun 2023
RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation
Gabriele Sarti
Phu Mon Htut
Xing Niu
B. Hsu
Anna Currey
Georgiana Dinu
Maria Nadejde
LRM
39
9
0
26 May 2023
The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics
Ricardo Rei
Nuno M. Guerreiro
Marcos Vinícius Treviso
Luísa Coheur
A. Lavie
André F.T. Martins
27
15
0
19 May 2023
Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation
Tu Anh Dinh
J. Niehues
24
5
0
12 May 2023
BMX: Boosting Natural Language Generation Metrics with Explainability
Christoph Leiter
Hoang-Quan Nguyen
Steffen Eger
ELM
18
0
0
20 Dec 2022
Extrinsic Evaluation of Machine Translation Metrics
Nikita Moghe
Tom Sherborne
Mark Steedman
Alexandra Birch
ELM
26
18
0
20 Dec 2022
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems
Songbo Hu
Ivan Vulić
Fangyu Liu
Anna Korhonen
39
0
0
07 Nov 2022
DEMETR: Diagnosing Evaluation Metrics for Translation
Marzena Karpinska
N. Raj
Katherine Thai
Yixiao Song
Ankita Gupta
Mohit Iyyer
26
37
0
25 Oct 2022
EffEval: A Comprehensive Evaluation of Efficiency for MT Evaluation Metrics
Daniil Larionov
Jens Grunwald
Christoph Leiter
Steffen Eger
20
5
0
20 Sep 2022
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task
Ricardo Rei
Marcos Vinícius Treviso
Nuno M. Guerreiro
Chrysoula Zerva
Ana C. Farinha
...
T. Glushkova
Duarte M. Alves
A. Lavie
Luísa Coheur
André F. T. Martins
60
138
0
13 Sep 2022
Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement
Zhen Yang
Fandong Meng
Yuanmeng Yan
Jie Zhou
26
3
0
13 Sep 2022
Learning to Scaffold: Optimizing Model Explanations for Teaching
Patrick Fernandes
Marcos Vinícius Treviso
Danish Pruthi
André F. T. Martins
Graham Neubig
FAtt
25
22
0
22 Apr 2022
Towards Explainable Evaluation Metrics for Natural Language Generation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
AAML
ELM
24
20
0
21 Mar 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri
Jinho Choi
L. F. D’Haro
Jan Deriu
M. Eskénazi
...
David Traum
Yi-Ting Yeh
Zhou Yu
Yizhe Zhang
Chen Zhang
30
21
0
18 Mar 2022
USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation
Jonas Belouadi
Steffen Eger
31
20
0
21 Feb 2022
DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence
Wei-Ye Zhao
Michael Strube
Steffen Eger
21
37
0
26 Jan 2022
Global Explainability of BERT-Based Evaluation Metrics by Disentangling along Linguistic Factors
Marvin Kaster
Wei-Ye Zhao
Steffen Eger
25
24
0
08 Oct 2021
MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset
M. Fomicheva
Shuo Sun
E. Fonseca
Chrysoula Zerva
Frédéric Blain
Vishrav Chaudhary
Francisco Guzmán
Nina Lopatina
Lucia Specia
André F. T. Martins
19
67
0
09 Oct 2020
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
254
3,684
0
28 Feb 2017