Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling
arXiv: 2406.07967 · 12 June 2024
Jie Ruan, Xiao Pu, Mingqi Gao, Xiaojun Wan, Yuesheng Zhu
Papers citing "Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling" (9 of 9 papers shown)
We need to talk about random seeds
Steven Bethard
24 Oct 2022

Transparent Human Evaluation for Image Captioning
Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith
17 Nov 2021

BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan, Graham Neubig, Pengfei Liu
22 Jun 2021

Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort
Vânia Mendonça, Ricardo Rei, Luísa Coheur, Alberto Sardinha, Ana Lúcia Santos (INESC-ID Lisboa)
27 May 2021

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
Jian Guan, Zhexin Zhang, Zhuoer Feng, Zitao Liu, Wenbiao Ding, Xiaoxi Mao, Changjie Fan, Minlie Huang
19 May 2021

Re-evaluating Evaluation in Text Summarization
Manik Bhandari, Pranav Narayan Gour, A. Ashfaq, Pengfei Liu, Graham Neubig
14 Oct 2020

Unifying Human and Statistical Evaluation for Natural Language Generation
Tatsunori B. Hashimoto, Hugh Zhang, Percy Liang
04 Apr 2019

Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
Max Grusky, Mor Naaman, Yoav Artzi
30 Apr 2018

Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt, E. Krahmer
29 Mar 2017