ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.09204
  4. Cited By
Fusion-Eval: Integrating Assistant Evaluators with LLMs

Fusion-Eval: Integrating Assistant Evaluators with LLMs

15 November 2023
Lei Shu
Nevan Wichers
Liangchen Luo
Yun Zhu
Yinxiao Liu
Jindong Chen
Lei Meng
    ELM
ArXivPDFHTML

Papers citing "Fusion-Eval: Integrating Assistant Evaluators with LLMs"

12 / 12 papers shown
Title
LLM-based NLG Evaluation: Current Status and Challenges
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
158
39
0
02 Feb 2024
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization
  Evaluation
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark
Shruti Rijhwani
Sebastian Gehrmann
Joshua Maynez
Roee Aharoni
Vitaly Nikolaev
Thibault Sellam
Aditya Siddhant
Dipanjan Das
Ankur P. Parikh
63
41
0
22 May 2023
Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Jiaan Wang
Yunlong Liang
Fandong Meng
Zengkui Sun
Haoxiang Shi
Zhixu Li
Jinan Xu
Jianfeng Qu
Jie Zhou
LM&MA
ELM
ALM
AI4MH
118
466
0
07 Mar 2023
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong
Yang Liu
Da Yin
Yuning Mao
Yizhu Jiao
Peng Liu
Chenguang Zhu
Heng Ji
Jiawei Han
ELM
74
272
0
13 Oct 2022
SMART: Sentences as Basic Units for Text Evaluation
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
60
22
0
01 Aug 2022
BARTScore: Evaluating Generated Text as Text Generation
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
104
843
0
22 Jun 2021
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog
  Generation
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
Shikib Mehri
M. Eskénazi
65
224
0
01 May 2020
Asking and Answering Questions to Evaluate the Factual Consistency of
  Summaries
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
Alex Jinpeng Wang
Kyunghyun Cho
M. Lewis
HILM
84
481
0
08 Apr 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
427
20,181
0
23 Oct 2019
MoverScore: Text Generation Evaluating with Contextualized Embeddings
  and Earth Mover Distance
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Wei Zhao
Maxime Peyrard
Fei Liu
Yang Gao
Christian M. Meyer
Steffen Eger
171
598
0
05 Sep 2019
BERTScore: Evaluating Text Generation with BERT
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
324
5,814
0
21 Apr 2019
A large annotated corpus for learning natural language inference
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
313
4,287
0
21 Aug 2015
1