
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling

12 June 2024
Jie Ruan, Xiao Pu, Mingqi Gao, Xiaojun Wan, Yuesheng Zhu

Papers citing "Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling"

9 papers shown
We need to talk about random seeds
Steven Bethard
24 Oct 2022
Transparent Human Evaluation for Image Captioning
Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith
17 Nov 2021
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan, Graham Neubig, Pengfei Liu
22 Jun 2021
Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort
Vânia Mendonça, Ricardo Rei, Luísa Coheur, Alberto Sardinha, Ana Lúcia Santos
27 May 2021
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
Jian Guan, Zhexin Zhang, Zhuoer Feng, Zitao Liu, Wenbiao Ding, Xiaoxi Mao, Changjie Fan, Minlie Huang
19 May 2021
Re-evaluating Evaluation in Text Summarization
Manik Bhandari, Pranav Narayan Gour, A. Ashfaq, Pengfei Liu, Graham Neubig
14 Oct 2020
Unifying Human and Statistical Evaluation for Natural Language Generation
Tatsunori B. Hashimoto, Hugh Zhang, Percy Liang
04 Apr 2019
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
Max Grusky, Mor Naaman, Yoav Artzi
30 Apr 2018
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt, E. Krahmer
29 Mar 2017