arXiv:2105.08920
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
19 May 2021
Jian Guan
Zhexin Zhang
Zhuoer Feng
Zitao Liu
Wenbiao Ding
Xiaoxi Mao
Changjie Fan
Minlie Huang
Papers citing "OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics"
20 / 20 papers shown
Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
Grgur Kovač
Jérémy Perez
Rémy Portelas
Peter Ford Dominey
Pierre-Yves Oudeyer
35
0
0
04 Apr 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
60
3
0
21 Feb 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
4-LEGS: 4D Language Embedded Gaussian Splatting
Gal Fiebelman
Tamir Cohen
Ayellet Morgenstern
Peter Hedman
Hadar Averbuch-Elor
3DGS
46
3
0
14 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng Yu
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
31
4
0
07 Oct 2024
Likelihood-based Mitigation of Evaluation Bias in Large Language Models
Masanari Ohi
Masahiro Kaneko
Ryuto Koike
Mengsay Loem
Naoaki Okazaki
40
4
0
25 Feb 2024
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
Kaiqiang Song
Xiaoyang Wang
Sangwoo Cho
Xiaoman Pan
Dong Yu
34
7
0
14 Dec 2023
Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts
Christina Chance
Da Yin
Dakuo Wang
Kai-Wei Chang
34
0
0
16 Oct 2023
Learning Personalized Alignment for Evaluating Open-ended Text Generation
Danqing Wang
Kevin Kaichuang Yang
Hanlin Zhu
Xiaomeng Yang
Andrew Cohen
Lei Li
Yuandong Tian
ALM
LM&MA
20
8
0
05 Oct 2023
Diversifying Question Generation over Knowledge Base via External Natural Questions
Shasha Guo
Jing Zhang
Xirui Ke
Cuiping Li
Hong Chen
42
3
0
23 Sep 2023
Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Jiaan Wang
Yunlong Liang
Fandong Meng
Zengkui Sun
Haoxiang Shi
Zhixu Li
Jinan Xu
Jianfeng Qu
Jie Zhou
LM&MA
ELM
ALM
AI4MH
62
446
0
07 Mar 2023
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Yuxin Wang
Jieru Lin
Zhiwei Yu
Wei Hu
Börje F. Karlsson
20
17
0
09 Dec 2022
StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
Hong Chen
D. Vo
Hiroya Takamura
Yusuke Miyao
Hideki Nakayama
27
20
0
16 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
121
94
0
06 Oct 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
53
50
0
24 Aug 2022
CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation
Pei Ke
Hao Zhou
Yankai Lin
Peng Li
Jie Zhou
Xiaoyan Zhu
Minlie Huang
21
37
0
02 Apr 2022
Rethinking and Refining the Distinct Metric
Siyang Liu
Sahand Sabour
Yinhe Zheng
Pei Ke
Xiaoyan Zhu
Minlie Huang
36
11
0
28 Feb 2022
A Temporal Variational Model for Story Generation
David Wilmot
Frank Keller
DRL
35
8
0
14 Sep 2021
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation
Jian Guan
Zhuoer Feng
Yamei Chen
Ru He
Xiaoxi Mao
Changjie Fan
Minlie Huang
39
32
0
30 Aug 2021
Adversarial Evaluation of Dialogue Models
Anjuli Kannan
Oriol Vinyals
AAML
ALM
141
76
0
27 Jan 2017