QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

16 December 2021
Alexander R. Fabbri
C. Wu
Wenhao Liu
Caiming Xiong
    HILM

Papers citing "QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization"

50 / 167 papers shown
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
Anisha Gunjal
Greg Durrett
HILM
58
13
0
28 Jun 2024
One Thousand and One Pairs: A "novel" challenge for long-context language models
Marzena Karpinska
Katherine Thai
Kyle Lo
Tanya Goyal
Mohit Iyyer
LRM
45
41
0
24 Jun 2024
Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics
Weijia Zhang
Mohammad Aliannejadi
Yifei Yuan
Jiahuan Pei
Jia-Hong Huang
Evangelos Kanoulas
HILM
33
12
0
21 Jun 2024
Factual Dialogue Summarization via Learning from Large Language Models
Rongxin Zhu
Jey Han Lau
Jianzhong Qi
HILM
60
1
0
20 Jun 2024
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors
Alex Chandler
Devesh Surve
Hui Su
HILM
UQCV
31
1
0
18 Jun 2024
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends
S. Ramprasad
Elisa Ferracane
Zachary Chase Lipton
HILM
19
12
0
05 Jun 2024
PatentEval: Understanding Errors in Patent Generation
You Zuo
Kim Gerdes
Eric Villemonte de la Clergerie
Benoît Sagot
39
1
0
05 Jun 2024
Faithful Chart Summarization with ChaTS-Pi
Syrine Krichene
Francesco Piccinno
Fangyu Liu
Julian Martin Eisenschlos
44
1
0
29 May 2024
Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
Jonas Becker
Jan Philip Wahle
Bela Gipp
Terry Ruas
37
9
0
24 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedIm
HILM
LM&MA
50
12
0
21 May 2024
ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation
Dimitris Gkoumas
39
0
0
14 May 2024
One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
Yoonjoo Lee
Kihoon Son
Tae Soo Kim
Jisu Kim
John Joon Young Chung
Eytan Adar
Juho Kim
39
11
0
09 May 2024
Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation
Siya Qi
Yulan He
Zheng Yuan
LRM
HILM
54
1
0
18 Apr 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
36
5
0
17 Apr 2024
Fewer Truncations Improve Language Modeling
Hantian Ding
Zijian Wang
Giovanni Paolini
Varun Kumar
Anoop Deoras
Dan Roth
Stefano Soatto
63
13
0
16 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILM
SyDa
43
80
0
16 Apr 2024
Less is More for Improving Automatic Evaluation of Factual Consistency
Tong Wang
Ninad Kulkarni
Yanjun Qi
ALM
49
2
0
09 Apr 2024
Select and Summarize: Scene Saliency for Movie Script Summarization
Rohit Saxena
Frank Keller
42
2
0
04 Apr 2024
Evaluating Document Simplification: On the Importance of Separately Assessing Simplicity and Meaning Preservation
Liam Cripwell
Joël Legrand
Claire Gardent
31
3
0
04 Apr 2024
Hallucination Diversity-Aware Active Learning for Text Summarization
Yu Xia
Xu Liu
Tong Yu
Sungchul Kim
Ryan A. Rossi
Anup B. Rao
Tung Mai
Shuai Li
HILM
45
3
0
02 Apr 2024
ReflectSumm: A Benchmark for Course Reflection Summarization
Yang Zhong
Mohamed S. Elaraby
Diane Litman
A. Butt
Muhsin Menekse
36
1
0
27 Mar 2024
A Closer Look at Claim Decomposition
Miriam Wanner
Seth Ebner
Zhengping Jiang
Mark Dredze
Benjamin Van Durme
49
18
0
18 Mar 2024
ROUGE-K: Do Your Summaries Have Keywords?
Sotaro Takeshita
Simone Paolo Ponzetto
Kai Eckert
29
0
0
08 Mar 2024
FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction
Alessandro Sciré
Karim Ghonim
Roberto Navigli
HILM
34
8
0
04 Mar 2024
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
Melanie Subbiah
Sean Zhang
Lydia B. Chilton
Kathleen McKeown
54
14
0
02 Mar 2024
Self-Consistent Decoding for More Factual Open Responses
Christopher Malon
Xiaodan Zhu
HILM
51
3
0
01 Mar 2024
How Much Annotation is Needed to Compare Summarization Models?
Chantal Shaib
Joe Barrow
Alexa F. Siu
Byron C. Wallace
A. Nenkova
59
2
0
28 Feb 2024
Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models
Jongyoon Song
Nohil Park
Bongkyu Hwang
Jaewoong Yun
Seongho Joe
Youngjune Gwon
Sungroh Yoon
KELM
HILM
43
1
0
23 Feb 2024
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Preslav Nakov
Tairan Wang
Qingqing Zhu
Taicheng Guo
Shen Gao
Zhiyong Lu
Xin Gao
Xiangliang Zhang
80
2
0
22 Feb 2024
Factual consistency evaluation of summarization in the Era of large language models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
HILM
35
1
0
21 Feb 2024
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Liyan Tang
Igor Shalyminov
Amy Wing-mei Wong
Jon Burnsky
Jake W. Vincent
...
Hang Su
Lijia Sun
Yi Zhang
Saab Mansour
Kathleen McKeown
HILM
29
45
0
20 Feb 2024
Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy
Liyan Xu
Zhenlin Su
Mo Yu
Jin Xu
Jinho D. Choi
Jie Zhou
Fei Liu
HILM
45
2
0
20 Feb 2024
GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence
Kundan Krishna
S. Ramprasad
Prakhar Gupta
Byron C. Wallace
Zachary Chase Lipton
Jeffrey P. Bigham
HILM
KELM
SyDa
57
9
0
19 Feb 2024
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
Sebastian Antony Joseph
Lily Chen
Jan Trienes
Hannah Louisa Göke
Monika Coers
Wei Xu
Byron C. Wallace
Junyi Jessy Li
LM&MA
HILM
26
10
0
18 Feb 2024
Improving Factual Error Correction for Abstractive Summarization via Data Distillation and Conditional-generation Cloze
Yiyang Li
Lei Li
Dingxing Hu
Xueyi Hao
Marina Litvak
N. Vanetik
Yanquan Zhou
HILM
KELM
24
0
0
13 Feb 2024
Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains
S. Ramprasad
Kundan Krishna
Zachary Chase Lipton
Byron C. Wallace
HILM
52
6
0
05 Feb 2024
Fine-grained Hallucination Detection and Editing for Language Models
Abhika Mishra
Akari Asai
Vidhisha Balachandran
Yizhong Wang
Graham Neubig
Yulia Tsvetkov
Hannaneh Hajishirzi
HILM
37
79
0
12 Jan 2024
Structsum Generation for Faster Text Comprehension
Parag Jain
Andreea Marzoca
Francesco Piccinno
ReLM
39
5
0
12 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
63
57
0
11 Jan 2024
Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization
Do Xuan Long
Mohammad Hassanpour
Ahmed Masry
P. Kavehzadeh
Enamul Hoque
Chenyu You
LRM
30
9
0
17 Dec 2023
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning
Kung-Hsiang Huang
Mingyang Zhou
Hou Pong Chan
Yi R. Fung
Zhenhailong Wang
Lingyu Zhang
Shih-Fu Chang
Chenhui Xu
23
33
0
15 Dec 2023
Do Text Simplification Systems Preserve Meaning? A Human Evaluation via Reading Comprehension
Sweta Agrawal
Marine Carpuat
27
7
0
15 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM
ELM
38
2
0
14 Dec 2023
Walking a Tightrope -- Evaluating Large Language Models in High-Risk Domains
Chia-Chien Hung
Wiem Ben-Rim
Lindsay Frost
Lars Bruckner
Carolin (Haas) Lawrence
AILaw
ALM
ELM
25
9
0
25 Nov 2023
Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering
Neha Srikanth
Rupak Sarkar
Heran Mane
Elizabeth M. Aparicio
Quynh C. Nguyen
Rachel Rudinger
Jordan Lee Boyd-Graber
19
2
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
32
6
0
16 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
58
751
0
09 Nov 2023
FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization
Nan Zhang
Yusen Zhang
Wu Guo
P. Mitra
Rui Zhang
HILM
43
4
0
03 Nov 2023
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs
Xue-Yong Fu
Md Tahmid Rahman Laskar
Cheng-Hsiung Chen
TN ShashiBhushan
HILM
ELM
68
18
0
01 Nov 2023
LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses
Xin Liu
Muhammad Khalifa
Lu Wang
ALM
39
18
0
30 Oct 2023