ResearchTrend.AI
Evaluating the Factual Consistency of Abstractive Text Summarization


28 October 2019
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
    HILM
arXiv:1910.12840

Papers citing "Evaluating the Factual Consistency of Abstractive Text Summarization"

50 / 463 papers shown
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
41
33
0
01 Oct 2023
Overview of the BioLaySumm 2023 Shared Task on Lay Summarization of Biomedical Research Articles
Tomas Goldsack
Jiancheng Yang
Qianqian Xie
Carolina Scarton
Matthew Shardlow
Sophia Ananiadou
Chenghua Lin
38
16
0
29 Sep 2023
STRONG -- Structure Controllable Legal Opinion Summary Generation
Yang Zhong
Diane Litman
ELM
AILaw
30
1
0
29 Sep 2023
Hallucination Reduction in Long Input Text Summarization
Gregor Lenz
Ronit Mandal
Abhishek Agarwal
Debarshi Kumar Sanyal
HILM
26
9
0
28 Sep 2023
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation
Jennifer A Bishop
Qianqian Xie
Sophia Ananiadou
HILM
25
10
0
21 Sep 2023
Investigating Answerability of LLMs for Long-Form Question Answering
Meghana Moorthy Bhat
Rui Meng
Ye Liu
Yingbo Zhou
Semih Yavuz
24
10
0
15 Sep 2023
Bias in News Summarization: Measures, Pitfalls and Corpora
Julius Steen
Katja Markert
28
4
0
14 Sep 2023
ExpertQA: Expert-Curated Questions and Attributed Answers
Chaitanya Malaviya
Subin Lee
Sihao Chen
Elizabeth Sieber
Mark Yatskar
Dan Roth
ELM
HILM
36
52
0
14 Sep 2023
Less is More for Long Document Summary Evaluation by LLMs
Yunshu Wu
Hayate Iso
Pouya Pezeshkpour
Nikita Bhutani
Estevam R. Hruschka
24
34
0
14 Sep 2023
BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models
Wei Qi Leong
Jian Gang Ngui
Yosephine Susanto
Hamsawardhini Rengarajan
Kengatharaiyer Sarveswaran
William-Chandra Tjhi
29
9
0
12 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
48
524
0
03 Sep 2023
Evaluation of Faithfulness Using the Longest Supported Subsequence
Anirudh Mittal
Timo Schick
Mikel Artetxe
Jane Dwivedi-Yu
ALM
27
0
0
23 Aug 2023
Generating Faithful Text From a Knowledge Graph with Noisy Reference Text
Tahsina Hashem
Weiqing Wang
Derry Wijaya
Mohammed Eunus Ali
Yuan-Fang Li
29
3
0
12 Aug 2023
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Ethan Chern
Steffi Chern
Shiqi Chen
Weizhe Yuan
Kehua Feng
Chunting Zhou
Junxian He
Graham Neubig
Pengfei Liu
HILM
32
193
0
25 Jul 2023
Guidance in Radiology Report Summarization: An Empirical Evaluation and Error Analysis
Jan Trienes
Paul Youssef
Jorg Schlotterer
Christin Seifert
24
0
0
24 Jul 2023
Agreement Tracking for Multi-Issue Negotiation Dialogues
Amogh Mannekote
Bonnie J. Dorr
K. Boyer
40
0
0
13 Jul 2023
Improving Factuality of Abstractive Summarization via Contrastive Reward Learning
Ethan Chern
Zhiruo Wang
Sanjan Das
Bhavuk Sharma
Pengfei Liu
Graham Neubig
HILM
17
14
0
10 Jul 2023
Text Alignment Is An Efficient Unified Model for Massive NLP Tasks
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
ALM
22
9
0
06 Jul 2023
Named Entity Inclusion in Abstractive Text Summarization
S. Berezin
Tatiana Batura
39
7
0
05 Jul 2023
Challenges in Domain-Specific Abstractive Summarization and How to Overcome them
Anum Afzal
Juraj Vladika
Daniel Braun
Florian Matthes
HILM
30
10
0
03 Jul 2023
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Christopher T. Small
Ivan Vendrov
Esin Durmus
Hadjar Homaei
Elizabeth Barry
Julien Cornebise
Ted Suzman
Deep Ganguli
Colin Megill
35
27
0
20 Jun 2023
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types
K. Murugesan
Sarathkrishna Swaminathan
Soham Dan
Subhajit Chaudhury
Chulaka Gunasekara
...
Ibrahim Abdelaziz
Achille Fokoue
Pavan Kapanipathi
Salim Roukos
Alexander G. Gray
42
5
0
18 Jun 2023
Unifying Large Language Models and Knowledge Graphs: A Roadmap
Shirui Pan
Linhao Luo
Yufei Wang
Chen Chen
Jiapu Wang
Xindong Wu
KELM
40
723
0
14 Jun 2023
Boosting Language Models Reasoning with Chain-of-Knowledge Prompting
Jie Wang
Qiushi Sun
Xiang Li
Ming Gao
ReLM
LRM
26
65
0
10 Jun 2023
Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording
Aisha Khatun
Daniel Brown
KELM
18
12
0
09 Jun 2023
Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework
Mingqi Gao
Xiaojun Wan
Jia Su
Zhefeng Wang
Baoxing Huai
HILM
16
8
0
08 Jun 2023
Multi-Dimensional Evaluation of Text Summarization with In-Context Learning
Sameer Jain
Vaishakh Keshava
Swarnashree Mysore Sathyendra
Patrick Fernandes
Pengfei Liu
Graham Neubig
Chunting Zhou
ELM
19
35
0
01 Jun 2023
Towards Argument-Aware Abstractive Summarization of Long Legal Opinions with Summary Reranking
Mohamed S. Elaraby
Yang Zhong
Diane Litman
AILaw
ELM
23
7
0
01 Jun 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
28
79
0
31 May 2023
Contrastive Hierarchical Discourse Graph for Scientific Document Summarization
Haopeng Zhang
Xiao Liu
Jiawei Zhang
AILaw
16
9
0
31 May 2023
A Critical Evaluation of Evaluations for Long-form Question Answering
Fangyuan Xu
Yixiao Song
Mohit Iyyer
Eunsol Choi
ELM
37
97
0
29 May 2023
An Investigation of Evaluation Metrics for Automated Medical Note Generation
Asma Ben Abacha
Wen-wai Yim
George Michalopoulos
Thomas Lin
25
22
0
27 May 2023
With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness
Julius Steen
Juri Opitz
Anette Frank
K. Markert
HILM
25
9
0
26 May 2023
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
26
182
0
26 May 2023
AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization
Guan-Hua Wang
Weihua Li
E. Lai
Quan-wei Bai
13
0
0
26 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
49
10
0
26 May 2023
Neural Natural Language Processing for Long Texts: A Survey on Classification and Summarization
Dimitrios Tsirmpas
Ioannis Gkionis
Georgios Th. Papadopoulos
Ioannis Mademlis
AILaw
AI4TS
AI4CE
43
17
0
25 May 2023
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation
Niels Mündler
Jingxuan He
Slobodan Jenko
Martin Vechev
HILM
22
108
0
25 May 2023
Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks
Xiao Pu
Mingqi Gao
Xiaojun Wan
ELM
26
3
0
24 May 2023
MuLER: Detailed and Scalable Reference-based Evaluation
Taelin Karidi
Leshem Choshen
Gal Patel
Omri Abend
40
0
0
24 May 2023
Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality
Tanay Dixit
Fei Wang
Muhao Chen
HILM
40
9
0
24 May 2023
SummIt: Iterative Text Summarization via ChatGPT
Haopeng Zhang
Xiao Liu
Jiawei Zhang
43
65
0
24 May 2023
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Weijia Shi
Xiaochuang Han
M. Lewis
Yulia Tsvetkov
Luke Zettlemoyer
Scott Yih
HILM
27
191
0
24 May 2023
SciFix: Outperforming GPT3 on Scientific Factual Error Correction
D. Ashok
Atharva Kulkarni
Hai Pham
Barnabas Poczos
30
1
0
24 May 2023
Scientific Opinion Summarization: Paper Meta-review Generation Dataset, Methods, and Evaluation
Qi Zeng
Mankeerat Sidhu
Ansel Blume
Hou Pong Chan
Lu Wang
Heng Ji
40
10
0
24 May 2023
Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization
Hou Pong Chan
Qi Zeng
Chenhui Xu
HILM
29
12
0
23 May 2023
LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond
Philippe Laban
Wojciech Kryściński
Divyansh Agarwal
Alexander R. Fabbri
Caiming Xiong
Chenyu You
Chien-Sheng Wu
ALM
HILM
35
33
0
23 May 2023
USB: A Unified Summarization Benchmark Across Tasks and Domains
Kundan Krishna
Prakhar Gupta
S. Ramprasad
Byron C. Wallace
Jeffrey P. Bigham
Zachary Chase Lipton
HILM
43
8
0
23 May 2023
Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs
Angelica Chen
Jason Phang
Alicia Parrish
Vishakh Padmakumar
Chen Zhao
Sam Bowman
Kyunghyun Cho
ReLM
LRM
33
29
0
23 May 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
86
611
0
23 May 2023