2402.13249
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
20 February 2024
Liyan Tang
Igor Shalyminov
Amy Wing-mei Wong
Jon Burnsky
Jake W. Vincent
Yuán Yang
Siffi Singh
Song Feng
Hwanjun Song
Hang Su
Lijia Sun
Yi Zhang
Saab Mansour
Kathleen McKeown
HILM
Papers citing
"TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization"
29 / 29 papers shown
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
Manveer Singh Tamber
F. S. Bao
Chenyu Xu
Ge Luo
Suleman Kazi
Minseok Bae
Miaoran Li
Ofer Mendelevitch
Renyi Qu
Jimmy J. Lin
VLM
33
0
0
07 May 2025
From Speech to Summary: A Comprehensive Survey of Speech Summarization
Fabian Retkowski
Maike Züfle
Andreas Sudmann
Dinah Pfau
Jan Niehues
Alexander Waibel
46
0
0
10 Apr 2025
ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback
Taewon Yun
Jihwan Oh
Hyangsuk Min
Yuho Lee
Jihwan Bang
Jason (Jinglun) Cai
Hwanjun Song
OffRL
LRM
39
0
0
27 Mar 2025
OAEI-LLM-T: A TBox Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching
Zhangcheng Qiang
Kerry Taylor
Weiqing Wang
Jing Jiang
52
0
0
25 Mar 2025
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration
David Wan
Justin Chih-Yao Chen
Elias Stengel-Eskin
Joey Tianyi Zhou
LLMAG
LRM
65
1
0
19 Mar 2025
Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
Swati Rallapalli
Shannon Gallagher
Andrew O. Mellinger
Jasmine Ratchford
Anusha Sinha
Tyler Brooks
William R. Nichols
Nick Winski
Bryan Brown
48
0
0
10 Mar 2025
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data
Deren Lei
Yaxi Li
Siyao Li
Mengya Hu
Rui Xu
Ken Archer
Mingyu Wang
Emily Ching
Alex Deng
SyDa
HILM
LRM
73
1
0
28 Jan 2025
Learning to Summarize from LLM-generated Feedback
Hwanjun Song
Taewon Yun
Yuho Lee
Jihwan Oh
Gihun Lee
Jason (Jinglun) Cai
Hang Su
73
3
0
28 Jan 2025
Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics
Ameya Godbole
Robin Jia
HILM
53
1
0
24 Jan 2025
SummExecEdit: A Factual Consistency Benchmark in Summarization with Executable Edits
Onkar Thorat
Philippe Laban
C. Wu
HILM
83
0
0
17 Dec 2024
Learning to Verify Summary Facts with Fine-Grained LLM Feedback
Jihwan Oh
J. Choi
Nicole Hee-Yeon Kim
Taewon Yun
Hwanjun Song
SyDa
ALM
HILM
76
1
0
14 Dec 2024
Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
S. Ramprasad
Byron C. Wallace
LLMAG
HILM
87
2
0
25 Nov 2024
On Positional Bias of Faithfulness for Long-form Summarization
David Wan
Jesse Vig
Joey Tianyi Zhou
Chenyu You
HILM
58
3
0
31 Oct 2024
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs
F. S. Bao
Miaoran Li
Renyi Qu
Ge Luo
Erana Wan
...
Ruixuan Tu
Chenyu Xu
Matthew Gonzales
Ofer Mendelevitch
Amin Ahmad
VLM
HILM
28
3
0
17 Oct 2024
Steering LLM Summarization with Visual Workspaces for Sensemaking
Xuxin Tang
Eric Krokos
Can Liu
Kylie Davidson
Kirsten Whitley
Naren Ramakrishnan
Chris North
13
0
0
25 Sep 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
85
5
0
11 Sep 2024
Zero-shot Factual Consistency Evaluation Across Domains
Raunak Agarwal
HILM
47
0
0
07 Aug 2024
Localizing and Mitigating Errors in Long-form Question Answering
Rachneet Sachdeva
Yixiao Song
Mohit Iyyer
Iryna Gurevych
HILM
52
0
0
16 Jul 2024
STORYSUMM: Evaluating Faithfulness in Story Summarization
Melanie Subbiah
Faisal Ladhak
Akankshya Mishra
Griffin Adams
Lydia B. Chilton
Kathleen McKeown
50
4
0
09 Jul 2024
Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa
Xinyu Zhao
Junyi Jessy Li
Greg Durrett
37
12
0
02 Jul 2024
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors
Alex Chandler
Devesh Surve
Hui Su
HILM
UQCV
31
1
0
18 Jun 2024
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang
Philip S. Yu
Jiawei Zhang
37
17
0
17 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
34
12
0
03 Jun 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
34
4
0
17 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILM
SyDa
43
76
0
16 Apr 2024
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
Melanie Subbiah
Sean Zhang
Lydia B. Chilton
Kathleen McKeown
54
14
0
02 Mar 2024
ExpertQA: Expert-Curated Questions and Attributed Answers
Chaitanya Malaviya
Subin Lee
Sihao Chen
Elizabeth Sieber
Mark Yatskar
Dan Roth
ELM
HILM
28
50
0
14 Sep 2023
Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system
Sumit Asthana
Sagi Hilleli
Pengcheng He
Aaron L Halfaker
37
11
0
28 Jul 2023
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
231
306
0
27 Apr 2021