ResearchTrend.AI

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

5 September 2019
Wei Zhao
Maxime Peyrard
Fei Liu
Yang Gao
Christian M. Meyer
Steffen Eger

Papers citing "MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance"

50 / 165 papers shown
Towards Better Evaluation for Generated Patent Claims
Lekang Jiang
Pascal A Scherz
Stephan Goetz
ELM
30
0
0
16 May 2025
SEval-Ex: A Statement-Level Framework for Explainable Summarization Evaluation
Tanguy Herserant
Vincent Guigue
ELM
45
0
0
04 May 2025
Summarization Metrics for Spanish and Basque: Do Automatic Scores and LLM-Judges Correlate with Humans?
Jeremy Barnes
Naiara Perez
Alba Bonet-Jover
Begoña Altuna
67
1
0
21 Mar 2025
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi
Eddy Ilg
Margret Keuper
Hideki Tanaka
Masao Utiyama
Raj Dabre
Steffen Eger
Simone Paolo Ponzetto
54
0
0
14 Mar 2025
Argument Summarization and its Evaluation in the Era of Large Language Models
Moritz Altemeyer
Steffen Eger
Johannes Daxenberger
Tim Altendorf
Philipp Cimiano
Benjamin Schiller
LM&MA
ELM
LRM
72
0
0
02 Mar 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
65
3
0
21 Feb 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
33
1
0
28 Jan 2025
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps
Henry Li
Ronen Basri
Y. Kluger
DiffM
64
2
0
13 Jan 2025
EventSum: A Large-Scale Event-Centric Summarization Dataset for Chinese Multi-News Documents
Mengna Zhu
Kaisheng Zeng
Mao Wang
Kaiming Xiao
Lei Hou
Hongbin Huang
Juanzi Li
300
1
0
16 Dec 2024
Bridging the Gap between Expert and Language Models: Concept-guided Chess Commentary Generation and Evaluation
Jaechang Kim
Jinmin Goh
Inseok Hwang
Jaewoong Cho
Jungseul Ok
ELM
35
1
0
28 Oct 2024
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts
German Gritsai
Anastasia Voznyuk
Andrey Grabovoy
Yury Chekhovich
DeLMO
84
1
0
18 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng Yu
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
42
4
0
07 Oct 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
50
9
0
03 Oct 2024
Impact of Model Size on Fine-tuned LLM Performance in Data-to-Text Generation: A State-of-the-Art Investigation
Joy Mahapatra
Utpal Garain
50
8
0
19 Jul 2024
MINDECHO: Role-Playing Language Agents for Key Opinion Leaders
Rui Xu
Dakuan Lu
Jue Chen
Xintao Wang
Siyu Yuan
Jiangjie Chen
Wei Chu
Xu Yinghui
LLMAG
39
3
0
07 Jul 2024
FineSurE: Fine-grained Summarization Evaluation using LLMs
Hwanjun Song
Hang Su
Igor Shalyminov
Jason (Jinglun) Cai
Saab Mansour
HILM
43
32
0
01 Jul 2024
RepEval: Effective Text Evaluation with LLM Representation
Shuqian Sheng
Yi Xu
Tianhang Zhang
Zanwei Shen
Luoyi Fu
Jiaxin Ding
Lei Zhou
Xinbing Wang
Cheng Zhou
43
2
0
30 Apr 2024
Edisum: Summarizing and Explaining Wikipedia Edits at Scale
Marija Sakota
Isaac Johnson
Guosheng Feng
Robert West
SyDa
KELM
48
2
0
04 Apr 2024
A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution
Bowen Ding
Qingkai Min
Shengkun Ma
Yingjie Li
Linyi Yang
Yue Zhang
46
4
0
02 Apr 2024
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada
Kanta Kaneda
Daichi Saito
Komei Sugiura
39
25
0
28 Feb 2024
Style-News: Incorporating Stylized News Generation and Adversarial Verification for Neural Fake News Detection
Wei-Yao Wang
Yu-Chieh Chang
Wenjie Peng
30
0
0
27 Jan 2024
LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces
Yingji Zhang
Danilo S. Carvalho
Ian Pratt-Hartmann
André Freitas
VLM
40
2
0
20 Dec 2023
LLMEval: A Preliminary Study on How to Evaluate Large Language Models
Yue Zhang
Ming Zhang
Haipeng Yuan
Shichun Liu
Yongyao Shi
Tao Gui
Qi Zhang
Xuanjing Huang
ALM
ELM
24
11
0
12 Dec 2023
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Trușcă
Marie-Francine Moens
55
1
0
27 Nov 2023
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects
Minqian Liu
Ying Shen
Zhiyang Xu
Yixin Cao
Eunah Cho
Vaibhav Kumar
Reza Ghanadan
Lifu Huang
ELM
LM&MA
ALM
57
25
0
15 Nov 2023
Towards Effective Paraphrasing for Information Disguise
Anmol Agarwal
Shrey Gupta
Vamshi Krishna Bonagiri
Manas Gaur
Joseph M. Reagle
Ponnurangam Kumaraguru
43
3
0
08 Nov 2023
Evaluating Generative Ad Hoc Information Retrieval
Lukas Gienapp
Harrisen Scells
Niklas Deckers
Janek Bevendorff
Shuai Wang
...
Maik Fröbe
Guido Zuccon
Benno Stein
Matthias Hagen
Martin Potthast
RALM
55
11
0
08 Nov 2023
OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization
Yuchen Shen
Xiaojun Wan
40
9
0
27 Oct 2023
CoheSentia: A Novel Benchmark of Incremental versus Holistic Assessment of Coherence in Generated Texts
Aviya Maimon
Reut Tsarfaty
21
6
0
25 Oct 2023
Tuna: Instruction Tuning using Feedback from Large Language Models
Haoran Li
Yiran Liu
Xingxing Zhang
Wei Lu
Furu Wei
ALM
41
3
0
20 Oct 2023
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review
Guanghua Wang
Weili Wu
AI4TS
AILaw
43
4
0
13 Oct 2023
Learning Personalized Alignment for Evaluating Open-ended Text Generation
Danqing Wang
Kevin Kaichuang Yang
Hanlin Zhu
Xiaomeng Yang
Andrew Cohen
Lei Li
Yuandong Tian
ALM
LM&MA
28
8
0
05 Oct 2023
Ragas: Automated Evaluation of Retrieval Augmented Generation
ES Shahul
Jithin James
Luis Espinosa-Anke
Steven Schockaert
91
179
0
26 Sep 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian
Elahe Khatibi
Iman Azimi
David Oniani
Zahra Shakeri Hossein Abad
...
Bryant Lin
Olivier Gevaert
Li-Jia Li
Ramesh C. Jain
Amir M. Rahmani
LM&MA
ELM
AI4MH
50
66
0
21 Sep 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
51
3
0
08 Aug 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
37
54
0
27 Jul 2023
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering
Pei Ke
Fei Huang
Fei Mi
Yasheng Wang
Qun Liu
Xiaoyan Zhu
Minlie Huang
ReLM
ELM
52
10
0
13 Jul 2023
Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A Practical Study
Guang Lu
Sylvia B. Larcher
Tu-Anh Tran
36
9
0
01 Jun 2023
A Practical Toolkit for Multilingual Question and Answer Generation
Asahi Ushio
Fernando Alva-Manchego
Jose Camacho-Collados
SyDa
38
14
0
27 May 2023
UMSE: Unified Multi-scenario Summarization Evaluation
Shen Gao
Zhitao Yao
Chongyang Tao
Preslav Nakov
Pengjie Ren
Zhaochun Ren
Zhumin Chen
45
5
0
26 May 2023
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
Yiming Wang
Zhuosheng Zhang
Rui Wang
48
81
0
22 May 2023
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks
Anas Himmi
Ekhine Irurozki
Nathan Noiry
Stéphan Clémençon
Pierre Colombo
40
5
0
17 May 2023
ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness
Archiki Prasad
Swarnadeep Saha
Xiang Zhou
Mohit Bansal
LRM
34
46
0
21 Apr 2023
An Empirical Study of Multitask Learning to Improve Open Domain Dialogue Systems
M. Farahani
Richard Johansson
24
0
0
17 Apr 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
98
1,090
0
29 Mar 2023
Exploring ChatGPT's Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
Yunjie Ji
Yan Gong
Yiping Peng
Chao Ni
Peiyan Sun
Dongyu Pan
Baochang Ma
Xiangang Li
ELM
ALM
AI4MH
32
37
0
14 Mar 2023
Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Jiaan Wang
Yunlong Liang
Fandong Meng
Zengkui Sun
Haoxiang Shi
Zhixu Li
Jinan Xu
Jianfeng Qu
Jie Zhou
LM&MA
ELM
ALM
AI4MH
67
449
0
07 Mar 2023
Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation
Yixin Liu
Alexander R. Fabbri
Yilun Zhao
Pengfei Liu
Chenyu You
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
17
28
0
07 Mar 2023
Factual Consistency Oriented Speech Recognition
Naoyuki Kanda
Takuya Yoshioka
Yang Liu
45
0
0
24 Feb 2023
Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization
Huang Chieh-Yang
Ting-Yao Hsu
Ryan A. Rossi
A. Nenkova
Sungchul Kim
G. Chan
Eunyee Koh
C. Lee Giles
Ting-Hao 'Kenneth' Huang
22
16
0
23 Feb 2023