QuestEval: Summarization Asks for Fact-based Evaluation

23 March 2021
Thomas Scialom
Paul-Alexis Dray
Patrick Gallinari
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
Alex Jinpeng Wang
    HILM

Papers citing "QuestEval: Summarization Asks for Fact-based Evaluation"

50 / 181 papers shown
Narrating Causal Graphs with Large Language Models
Atharva Phatak
Vijay Mago
Ameeta Agrawal
Aravind Inbasekaran
Philippe J. Giabbanelli
AI4CE
56
3
0
11 Mar 2024
On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization
Lorenzo Jaime Yu Flores
Arman Cohan
HILM
46
2
0
09 Mar 2024
ROUGE-K: Do Your Summaries Have Keywords?
Sotaro Takeshita
Simone Paolo Ponzetto
Kai Eckert
24
0
0
08 Mar 2024
Book2Dial: Generating Teacher-Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots
Junling Wang
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Mrinmaya Sachan
37
8
0
05 Mar 2024
FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction
Alessandro Sciré
Karim Ghonim
Roberto Navigli
HILM
29
8
0
04 Mar 2024
Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks
Huajian Zhang
Yumo Xu
Laura Perez-Beltrachini
HILM
34
10
0
27 Feb 2024
Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models
Jongyoon Song
Nohil Park
Bongkyu Hwang
Jaewoong Yun
Seongho Joe
Youngjune Gwon
Sungroh Yoon
KELM
HILM
38
1
0
23 Feb 2024
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Xiuying Chen
Tairan Wang
Qingqing Zhu
Taicheng Guo
Shen Gao
Zhiyong Lu
Xin Gao
Xiangliang Zhang
80
2
0
22 Feb 2024
Factual consistency evaluation of summarization in the Era of large language models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
HILM
35
1
0
21 Feb 2024
Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy
Liyan Xu
Zhenlin Su
Mo Yu
Jin Xu
Jinho D. Choi
Jie Zhou
Fei Liu
HILM
43
2
0
20 Feb 2024
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
Sebastian Antony Joseph
Lily Chen
Jan Trienes
Hannah Louisa Göke
Monika Coers
Wei Xu
Byron C. Wallace
Junyi Jessy Li
LM&MA
HILM
26
10
0
18 Feb 2024
Improving Factual Error Correction for Abstractive Summarization via Data Distillation and Conditional-generation Cloze
Yiyang Li
Lei Li
Dingxing Hu
Xueyi Hao
Marina Litvak
N. Vanetik
Yanquan Zhou
HILM
KELM
24
0
0
13 Feb 2024
Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains
S. Ramprasad
Kundan Krishna
Zachary Chase Lipton
Byron C. Wallace
HILM
52
6
0
05 Feb 2024
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
Yuanjie Lyu
Zhiyu Li
Pengnian Qi
Bo Tang
Simin Niu
Wenjin Wang
Hao Wu
Huan Liu
Tong Xu
Enhong Chen
RALM
44
33
0
30 Jan 2024
Combining Hierachical VAEs with LLMs for clinically meaningful timeline summarisation in social media
Jiayu Song
Jenny Chim
Adam Tsakalidis
Julia Ive
Dana Atzil-Slonim
M. Liakata
27
3
0
29 Jan 2024
Hallucination Detection and Hallucination Mitigation: An Investigation
Junliang Luo
Tianyu Li
Di Wu
Michael R. M. Jenkin
Steve Liu
Gregory Dudek
HILM
LLMAG
46
22
0
16 Jan 2024
JustiLM: Few-shot Justification Generation for Explainable Fact-Checking of Real-world Claims
Fengzhu Zeng
Wei Gao
41
15
0
16 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
63
57
0
11 Jan 2024
Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and Neural Machine Translation
Zhaokun Jiang
Ziyin Zhang
EGVM
24
3
0
10 Jan 2024
Do Androids Know They're Only Dreaming of Electric Sheep?
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
35
27
0
28 Dec 2023
LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores
Yiqi Liu
N. Moosavi
Chenghua Lin
ELM
35
46
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
28
6
0
16 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
56
744
0
09 Nov 2023
FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization
Nan Zhang
Yusen Zhang
Wu Guo
P. Mitra
Rui Zhang
HILM
43
4
0
03 Nov 2023
Post Turing: Mapping the landscape of LLM Evaluation
Alexey Tikhonov
Ivan P. Yamshchikov
ELM
51
4
0
03 Nov 2023
OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization
Yuchen Shen
Xiaojun Wan
38
9
0
27 Oct 2023
Background Summarization of Event Timelines
Adithya Pratapa
Kevin Small
Markus Dreyer
63
2
0
24 Oct 2023
'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism
Ronald Cardenas
Bingsheng Yao
Dakuo Wang
Yufang Hou
27
0
0
23 Oct 2023
Fast and Accurate Factual Inconsistency Detection Over Long Documents
B. Lattimer
Patrick Chen
Xinyuan Zhang
Yi Yang
HILM
11
18
0
19 Oct 2023
Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model
Qi Jia
Siyu Ren
Yizhu Liu
Kenny Q. Zhu
ALM
HILM
33
16
0
18 Oct 2023
Calibrating Likelihoods towards Consistency in Summarization Models
Polina Zablotskaia
Misha Khalman
Rishabh Joshi
Livio Baldini Soares
Shoshana Jakobovits
Joshua Maynez
Shashi Narayan
31
3
0
12 Oct 2023
Simplicity Level Estimate (SLE): A Learned Reference-Less Metric for Sentence Simplification
Liam Cripwell
Joël Legrand
Claire Gardent
29
13
0
12 Oct 2023
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation
Jennifer A Bishop
Qianqian Xie
Sophia Ananiadou
HILM
25
10
0
21 Sep 2023
RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue
Zhengliang Shi
Weiwei Sun
Shuo Zhang
Zhen Zhang
Pengjie Ren
Z. Ren
24
8
0
15 Sep 2023
Cited Text Spans for Citation Text Generation
Xiangci Li
Yi-Hui Lee
Jessica Ouyang
3DV
22
6
0
12 Sep 2023
FaNS: a Facet-based Narrative Similarity Metric
Mousumi Akter
Shubhra (Santu) Karmaker
25
1
0
09 Sep 2023
Evaluation of Faithfulness Using the Longest Supported Subsequence
Anirudh Mittal
Timo Schick
Mikel Artetxe
Jane Dwivedi-Yu
ALM
27
0
0
23 Aug 2023
Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation
Federico Betti
Jacopo Staiano
Lorenzo Baraldi
Lorenzo Baraldi
Rita Cucchiara
N. Sebe
EGVM
31
6
0
18 Jul 2023
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models
Adian Liusie
Potsawee Manakul
Mark Gales
ELM
32
35
0
15 Jul 2023
Improving Factuality of Abstractive Summarization via Contrastive Reward Learning
Ethan Chern
Zhiruo Wang
Sanjan Das
Bhavuk Sharma
Pengfei Liu
Graham Neubig
HILM
17
14
0
10 Jul 2023
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types
K. Murugesan
Sarathkrishna Swaminathan
Soham Dan
Subhajit Chaudhury
Chulaka Gunasekara
...
Ibrahim Abdelaziz
Achille Fokoue
Pavan Kapanipathi
Salim Roukos
Alexander G. Gray
42
5
0
18 Jun 2023
Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework
Mingqi Gao
Xiaojun Wan
Jia Su
Zhefeng Wang
Baoxing Huai
HILM
18
8
0
08 Jun 2023
Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
Yan Xu
Deqian Kong
Dehong Xu
Ziwei Ji
Bo Pang
Pascale Fung
Yingting Wu
29
6
0
01 Jun 2023
Improving the Robustness of Summarization Systems with Dual Augmentation
Xiuying Chen
Guodong Long
Chongyang Tao
Mingzhe Li
Xin Gao
Chen Zhang
Xiangliang Zhang
AAML
32
11
0
01 Jun 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
28
79
0
31 May 2023
UMSE: Unified Multi-scenario Summarization Evaluation
Shen Gao
Zhitao Yao
Chongyang Tao
Xiuying Chen
Pengjie Ren
Z. Ren
Zhumin Chen
35
5
0
26 May 2023
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
26
182
0
26 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
49
10
0
26 May 2023
Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality
Tanay Dixit
Fei Wang
Muhao Chen
HILM
40
9
0
24 May 2023
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content
Shuyang Cao
Lu Wang
30
5
0
24 May 2023