ResearchTrend.AI

A global analysis of metrics used for measuring performance in natural language processing

25 April 2022
Kathrin Blagec
Georg Dorffner
M. Moradi
Simon Ott
Matthias Samwald
arXiv: 2204.11574

Papers citing "A global analysis of metrics used for measuring performance in natural language processing"

12 / 12 papers shown
Title
How many preprints have actually been printed and why: a case study of computer science preprints on arXiv
Jialiang Lin
Yao Yu
Yu Zhou
Zhiyang Zhou
Xiaodong Shi
AI4CE
25
47
0
03 Aug 2023
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Lavinia Dunagan
Jacob Morrison
Alexander R. Fabbri
Yejin Choi
Noah A. Smith
68
40
0
08 Dec 2021
A curated, ontology-based, large-scale knowledge graph of artificial intelligence tasks and benchmarks
Kathrin Blagec
A. Barbosa-Silva
Simon Ott
Matthias Samwald
44
26
0
04 Oct 2021
Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers
Benjamin Marie
Atsushi Fujita
Raphaël Rubino
ELM
34
104
0
29 Jun 2021
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
Zhiyi Ma
Kawin Ethayarajh
Tristan Thrush
Somya Jain
Ledell Yu Wu
Robin Jia
Christopher Potts
Adina Williams
Douwe Kiela
ELM
72
57
0
21 May 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
300
286
0
02 Feb 2021
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
279
5,764
0
21 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.5K
94,511
0
11 Oct 2018
ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks
Kavita A. Ganesan
34
148
0
05 Mar 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
188
11,542
0
15 Feb 2018
Why We Need New Evaluation Metrics for NLG
Jekaterina Novikova
Ondrej Dusek
Amanda Cercas Curry
Verena Rieser
73
459
0
21 Jul 2017
Better Summarization Evaluation with Word Embeddings for ROUGE
Jun-Ping Ng
Viktoria Abrecht
53
171
0
25 Aug 2015