2005.03754
Cited By
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
7 May 2020
Esin Durmus
He He
Mona T. Diab
HILM
Papers citing
"FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization"
50 / 102 papers shown
Title
Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation
Galann Pennec
Zhengyuan Liu
Nicholas Asher
Philippe Muller
Nancy F. Chen
VGen
31
0
0
10 May 2025
Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection
Atharva Kulkarni
Yuan-kang Zhang
Joel Ruben Antony Moniz
Xiou Ge
Bo-Hsiang Tseng
Dhivya Piraviperumal
Shri Kiran Srinivasan
Hong-ye Yu
HILM
86
0
0
25 Apr 2025
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu
Lin Ning
Luyang Liu
Harrison Lee
Neo Wu
Chao Wang
Sushant Prakash
S. O’Banion
Bradley Green
Jun Xie
71
1
0
20 Jan 2025
SteLLA: A Structured Grading System Using LLMs with RAG
Hefei Qiu
Brian White
Ashley Ding
Reinaldo Costa
Ali Hachem
Wei Ding
Ping Chen
AI4Ed
61
0
0
17 Jan 2025
Fine-grained and Explainable Factuality Evaluation for Multimodal Summarization
Liqiang Jing
Jingxuan Zuo
Yue Zhang
47
7
0
31 Dec 2024
EventSum: A Large-Scale Event-Centric Summarization Dataset for Chinese Multi-News Documents
Mengna Zhu
Kaisheng Zeng
Mao Wang
Kaiming Xiao
Lei Hou
Hongbin Huang
Juanzi Li
206
1
0
16 Dec 2024
STORYSUMM: Evaluating Faithfulness in Story Summarization
Melanie Subbiah
Faisal Ladhak
Akankshya Mishra
Griffin Adams
Lydia B. Chilton
Kathleen McKeown
50
4
0
09 Jul 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
Yuzhe Gu
Ziwei Ji
Wenwei Zhang
Chengqi Lyu
Dahua Lin
Kai Chen
HILM
39
5
0
05 Jul 2024
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
Jannik Kossen
Jiatong Han
Muhammed Razzak
Lisa Schut
Shreshth A. Malik
Yarin Gal
HILM
60
34
0
22 Jun 2024
Factual Dialogue Summarization via Learning from Large Language Models
Rongxin Zhu
Jey Han Lau
Jianzhong Qi
HILM
52
1
0
20 Jun 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou
Yang Zhang
Jacob Andreas
Shiyu Chang
77
5
0
11 Jun 2024
A Closer Look at Claim Decomposition
Miriam Wanner
Seth Ebner
Zhengping Jiang
Mark Dredze
Benjamin Van Durme
49
18
0
18 Mar 2024
German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset
Laura Mascarell
Ribin Chalumattu
Annette Rios
HILM
46
0
0
06 Mar 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
65
29
0
02 Feb 2024
Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation
Wei-Lin Chen
Cheng-Kuang Wu
Hsin-Hsi Chen
Chung-Chi Chen
HILM
26
6
0
23 Oct 2023
Cognitive Mirage: A Review of Hallucinations in Large Language Models
Hongbin Ye
Tong Liu
Aijia Zhang
Wei Hua
Weiqiang Jia
HILM
48
76
0
13 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
46
522
0
03 Sep 2023
Knowledge Graph for NLG in the context of conversational agents
Hussam Ghanem
Massinissa Atmani
C. Cruz
26
1
0
04 Jul 2023
UMSE: Unified Multi-scenario Summarization Evaluation
Shen Gao
Zhitao Yao
Chongyang Tao
Xiuying Chen
Pengjie Ren
Z. Ren
Zhumin Chen
30
5
0
26 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
42
9
0
26 May 2023
Evaluating Factual Consistency of Summaries with Large Language Models
Shiqi Chen
Siyang Gao
Junxian He
ELM
LRM
HILM
35
6
0
23 May 2023
Evaluating Factual Consistency of Texts with Semantic Role Labeling
Jing Fan
Dennis Aumiller
Michael Gertz
HILM
34
4
0
22 May 2023
A Method to Automate the Discharge Summary Hospital Course for Neurology Patients
Vince C. Hartman
Sanika S. Bapat
M. Weiner
B. Navi
E. Sholle
T. Campion
32
18
0
10 May 2023
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
E. Ponti
MoMe
34
30
0
30 Mar 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
53
1,078
0
29 Mar 2023
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
43
478
0
31 Jan 2023
Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection
Weijia Xu
Sweta Agrawal
Eleftheria Briakou
Marianna J. Martindale
Marine Carpuat
HILM
27
46
0
18 Jan 2023
Contrastive Error Attribution for Finetuned Language Models
Faisal Ladhak
Esin Durmus
Tatsunori Hashimoto
HILM
30
9
0
21 Dec 2022
WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning
Wenhao Wu
Wei Li
Xinyan Xiao
Jiachen Liu
Sujian Li
Yajuan Lv
HILM
26
4
0
20 Dec 2022
BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics
Liang Ma
Shuyang Cao
Robert L. Logan IV
Di Lu
Shihao Ran
Kecheng Zhang
Joel R. Tetreault
A. Jaimes
17
6
0
20 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
58
99
0
19 Dec 2022
Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences
Asish Ghoshal
Arash Einolghozati
A. Arun
Haoran Li
L. Yu
Vera Gor
Yashar Mehdad
Scott Yih
Asli Celikyilmaz
HILM
29
1
0
19 Dec 2022
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Yixin Liu
Alexander R. Fabbri
Pengfei Liu
Yilun Zhao
Linyong Nan
...
Simeng Han
Chenyu You
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
ALM
24
132
0
15 Dec 2022
RHO (ρ): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding
Ziwei Ji
Zihan Liu
Nayeon Lee
Tiezheng Yu
Bryan Wilie
Mini Zeng
Pascale Fung
HILM
23
53
0
03 Dec 2022
HaRiM+: Evaluating Summary Quality with Hallucination Risk
Seonil Son
Junsoo Park
J. Hwang
Junghwa Lee
Hyungjong Noh
Yeonsoo Lee
HILM
16
8
0
22 Nov 2022
Consecutive Question Generation via Dynamic Multitask Learning
Yun Li
Sujian Li
Xing Shi
LRM
24
2
0
16 Nov 2022
Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency
Yanzhu Guo
Chloé Clavel
Moussa Kamal Eddine
Michalis Vazirgiannis
HILM
32
11
0
31 Oct 2022
How Far are We from Robust Long Abstractive Summarization?
Huan Yee Koh
Jiaxin Ju
He Zhang
Ming Liu
Shirui Pan
HILM
28
39
0
30 Oct 2022
Analyzing and Evaluating Faithfulness in Dialogue Summarization
Bin Wang
Chen Zhang
Yan Zhang
Yiming Chen
Haizhou Li
HILM
41
14
0
21 Oct 2022
Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions
Qi Jia
Yizhu Liu
Siyu Ren
Kenny Q. Zhu
29
6
0
18 Oct 2022
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong
Yang Liu
Da Yin
Yuning Mao
Yizhu Jiao
Peng Liu
Chenguang Zhu
Heng Ji
Jiawei Han
ELM
45
255
0
13 Oct 2022
Shortcomings of Question Answering Based Factuality Frameworks for Error Localization
Ryo Kamoi
Tanya Goyal
Greg Durrett
HILM
33
14
0
13 Oct 2022
Just ClozE! A Novel Framework for Evaluating the Factual Consistency Faster in Abstractive Summarization
Yiyang Li
Lei Li
Marina Litvak
N. Vanetik
Dingxing Hu
Yuze Li
Yanquan Zhou
HILM
40
0
0
06 Oct 2022
Entity-based SpanCopy for Abstractive Summarization to Improve the Factual Consistency
Wen Xiao
Giuseppe Carenini
HILM
37
16
0
07 Sep 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
30
21
0
01 Aug 2022
Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control
Haopeng Zhang
Semih Yavuz
Wojciech Kryściński
Kazuma Hashimoto
Yingbo Zhou
HILM
35
34
0
05 Jul 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
81
122
0
03 Jul 2022
QA Is the New KR: Question-Answer Pairs as Knowledge Bases
Wenhu Chen
William W. Cohen
Michiel de Jong
Nitish Gupta
Alessandro Presta
Pat Verga
John Wieting
27
7
0
01 Jul 2022
Conditional Generation with a Question-Answering Blueprint
Shashi Narayan
Joshua Maynez
Reinald Kim Amplayo
Kuzman Ganchev
Annie Louis
Fantine Huot
Anders Sandholm
Dipanjan Das
Mirella Lapata
59
47
0
01 Jul 2022
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
Alex Wang
Richard Yuanzhe Pang
Angelica Chen
Jason Phang
Samuel R. Bowman
74
44
0
23 May 2022