Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

8 April 2020
Alex Wang
Kyunghyun Cho
M. Lewis
    HILM

Papers citing "Asking and Answering Questions to Evaluate the Factual Consistency of Summaries"

50 / 327 papers shown
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
Cheng Niu
Yuanhao Wu
Juno Zhu
Siliang Xu
Kashun Shum
Randy Zhong
Juntong Song
Tong Zhang
HILM
28
87
0
31 Dec 2023
Do Androids Know They're Only Dreaming of Electric Sheep?
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
35
27
0
28 Dec 2023
Do Text Simplification Systems Preserve Meaning? A Human Evaluation via Reading Comprehension
Sweta Agrawal
Marine Carpuat
27
7
0
15 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM
ELM
38
2
0
14 Dec 2023
DelucionQA: Detecting Hallucinations in Domain-specific Question Answering
Mobashir Sadat
Zhengyu Zhou
Lukas Lange
Jun Araki
Arsalan Gundroo
Bingqing Wang
Rakesh R Menon
Md. Rizwan Parvez
Zhe Feng
HILM
37
37
0
08 Dec 2023
Perspectives on the State and Future of Deep Learning - 2023
Micah Goldblum
A. Anandkumar
Richard Baraniuk
Tom Goldstein
Kyunghyun Cho
Zachary Chase Lipton
Melanie Mitchell
Preetum Nakkiran
Max Welling
Andrew Gordon Wilson
61
4
0
07 Dec 2023
P^3SUM: Preserving Author's Perspective in News Summarization with Diffusion Language Models
Yuhan Liu
Shangbin Feng
Xiaochuang Han
Vidhisha Balachandran
Chan Young Park
Sachin Kumar
Yulia Tsvetkov
DiffM
44
2
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
28
6
0
16 Nov 2023
Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization
G. Chrysostomou
Zhixue Zhao
Miles Williams
Nikolaos Aletras
HILM
34
10
0
15 Nov 2023
Fusion-Eval: Integrating Assistant Evaluators with LLMs
Lei Shu
Nevan Wichers
Liangchen Luo
Yun Zhu
Yinxiao Liu
Jindong Chen
Lei Meng
ELM
15
3
0
15 Nov 2023
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects
Minqian Liu
Ying Shen
Zhiyang Xu
Yixin Cao
Eunah Cho
Vaibhav Kumar
Reza Ghanadan
Lifu Huang
ELM
LM&MA
ALM
52
25
0
15 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
47
732
0
09 Nov 2023
First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models
Naomi Saphra
Eve Fleisig
Kyunghyun Cho
Adam Lopez
LRM
30
8
0
08 Nov 2023
FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization
Nan Zhang
Yusen Zhang
Wu Guo
P. Mitra
Rui Zhang
HILM
43
4
0
03 Nov 2023
Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding
Lixing Zhu
Runcong Zhao
Lin Gui
Yulan He
52
4
0
28 Oct 2023
Correction with Backtracking Reduces Hallucination in Summarization
Zhenzhen Liu
Chao-gang Wan
Varsha Kishore
Jin Peng Zhou
Minmin Chen
Kilian Q. Weinberger
HILM
26
3
0
24 Oct 2023
Language Models Hallucinate, but May Excel at Fact Verification
Jian Guan
Jesse Dodge
David Wadden
Minlie Huang
Hao Peng
LRM
HILM
34
28
0
23 Oct 2023
Chainpoll: A high efficacy method for LLM hallucination detection
Robert Friel
Atindriyo Sanyal
LRM
HILM
34
26
0
22 Oct 2023
Fast and Accurate Factual Inconsistency Detection Over Long Documents
B. Lattimer
Patrick Chen
Xinyuan Zhang
Yi Yang
HILM
6
18
0
19 Oct 2023
Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model
Qi Jia
Siyu Ren
Yizhu Liu
Kenny Q. Zhu
ALM
HILM
33
16
0
18 Oct 2023
Metric Ensembles For Hallucination Detection
Grant C. Forbes
Parth Katlana
Zeydy Ortiz
HILM
41
4
0
16 Oct 2023
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters
Yixin Wan
George Pu
Jiao Sun
Aparna Garimella
Kai-Wei Chang
Nanyun Peng
34
162
0
13 Oct 2023
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
Sehyun Choi
Tianqing Fang
Zhaowei Wang
Yangqiu Song
35
32
0
13 Oct 2023
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators
Liang Chen
Yang Deng
Yatao Bian
Zeyu Qin
Bingzhe Wu
Tat-Seng Chua
Kam-Fai Wong
HILM
ELM
60
43
0
11 Oct 2023
Compressing Context to Enhance Inference Efficiency of Large Language Models
Yucheng Li
Bo Dong
Chenghua Lin
Frank Guerin
19
57
0
09 Oct 2023
Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations
Deren Lei
Yaxi Li
Mengya Hu
Mingyu Wang
Vincent Yun
Emily Ching
Eslam Kamal
HILM
LRM
24
40
0
06 Oct 2023
Fusing Models with Complementary Expertise
Hongyi Wang
Felipe Maia Polo
Yuekai Sun
Souvik Kundu
Eric Xing
Mikhail Yurochkin
FedML
MoMe
28
26
0
02 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
41
33
0
01 Oct 2023
Calibrating LLM-Based Evaluator
Yuxuan Liu
Tianchi Yang
Shaohan Huang
Zihan Zhang
Haizhen Huang
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
49
31
0
23 Sep 2023
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation
Jennifer A Bishop
Qianqian Xie
Sophia Ananiadou
HILM
22
9
0
21 Sep 2023
ExpertQA: Expert-Curated Questions and Attributed Answers
Chaitanya Malaviya
Subin Lee
Sihao Chen
Elizabeth Sieber
Mark Yatskar
Dan Roth
ELM
HILM
31
52
0
14 Sep 2023
FaNS: a Facet-based Narrative Similarity Metric
Mousumi Akter
Shubhra (Santu) Karmaker
25
1
0
09 Sep 2023
Zero-Resource Hallucination Prevention for Large Language Models
Junyu Luo
Cao Xiao
Fenglong Ma
HILM
31
16
0
06 Sep 2023
Evaluation of Faithfulness Using the Longest Supported Subsequence
Anirudh Mittal
Timo Schick
Mikel Artetxe
Jane Dwivedi-Yu
ALM
27
0
0
23 Aug 2023
ALens: An Adaptive Domain-Oriented Abstract Writing Training Tool for Novice Researchers
Chen Cheng
Ziang Li
Zhenhui Peng
Quan Li
24
0
0
08 Aug 2023
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Ethan Chern
Steffi Chern
Shiqi Chen
Weizhe Yuan
Kehua Feng
Chunting Zhou
Junxian He
Graham Neubig
Pengfei Liu
HILM
27
193
0
25 Jul 2023
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models
Adian Liusie
Potsawee Manakul
Mark Gales
ELM
29
35
0
15 Jul 2023
Improving Factuality of Abstractive Summarization via Contrastive Reward Learning
Ethan Chern
Zhiruo Wang
Sanjan Das
Bhavuk Sharma
Pengfei Liu
Graham Neubig
HILM
12
14
0
10 Jul 2023
Text Alignment Is An Efficient Unified Model for Massive NLP Tasks
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
ALM
22
9
0
06 Jul 2023
Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
Ran Zhang
Jihed Ouni
Steffen Eger
32
6
0
22 Jun 2023
Neural models for Factual Inconsistency Classification with Explanations
Tathagata Raha
Mukund Choudhary
Abhinav Menon
Harshit Gupta
KV Aditya Srivatsa
Manish Gupta
Vasudeva Varma
24
3
0
15 Jun 2023
SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation
Nianlong Gu
Richard H. R. Hahnloser
66
5
0
06 Jun 2023
Multi-Dimensional Evaluation of Text Summarization with In-Context Learning
Sameer Jain
Vaishakh Keshava
Swarnashree Mysore Sathyendra
Patrick Fernandes
Pengfei Liu
Graham Neubig
Chunting Zhou
ELM
11
35
0
01 Jun 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
28
77
0
31 May 2023
An Investigation of Evaluation Metrics for Automated Medical Note Generation
Asma Ben Abacha
Wen-wai Yim
George Michalopoulos
Thomas Lin
22
22
0
27 May 2023
With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness
Julius Steen
Juri Opitz
Anette Frank
K. Markert
HILM
25
9
0
26 May 2023
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
21
180
0
26 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
44
10
0
26 May 2023
Mastering the ABCDs of Complex Questions: Answer-Based Claim Decomposition for Fine-grained Self-Evaluation
Nishant Balepur
Jie Huang
Samraj Moorjani
Hari Sundaram
Kevin Chen-Chuan Chang
ReLM
32
0
0
24 May 2023
Enabling Large Language Models to Generate Text with Citations
Tianyu Gao
Howard Yen
Jiatong Yu
Danqi Chen
LM&MA
HILM
40
315
0
24 May 2023