FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization

7 May 2020

Esin Durmus

Papers citing "FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization"

50 / 109 papers shown

Title
QA Is the New KR: Question-Answer Pairs as Knowledge Bases Wenhu Chen William W. Cohen Michiel de Jong Nitish Gupta Alessandro Presta Pat Verga John Wieting 27 7 0 01 Jul 2022
Conditional Generation with a Question-Answering Blueprint Shashi Narayan Joshua Maynez Reinald Kim Amplayo Kuzman Ganchev Annie Louis Fantine Huot Anders Sandholm Dipanjan Das Mirella Lapata 61 47 0 01 Jul 2022
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way Alex Jinpeng Wang Richard Yuanzhe Pang Angelica Chen Jason Phang Samuel R. Bowman 74 44 0 23 May 2022
QASem Parsing: Text-to-text Modeling of QA-based Semantics Ayal Klein Eran Hirsch Ron Eliav Valentina Pyatkin Avi Caciularu Ido Dagan 38 12 0 23 May 2022
TempLM: Distilling Language Models into Template-Based Generators Tianyi Zhang Mina Lee Lisa Li Ende Shen Tatsunori B. Hashimoto VLM 40 5 0 23 May 2022
Generating Literal and Implied Subquestions to Fact-check Complex Claims Jifan Chen Aniruddh Sriram Eunsol Choi Greg Durrett HILM 36 60 0 14 May 2022
Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization Prasetya Ajie Utama Joshua Bambrick N. Moosavi Iryna Gurevych HILM 16 42 0 12 May 2022
Efficient Few-Shot Fine-Tuning for Opinion Summarization Arthur Bražinskas Ramesh Nallapati Joey Tianyi Zhou Markus Dreyer 19 24 0 04 May 2022
All You May Need for VQA are Image Captions Soravit Changpinyo Doron Kukliansky Idan Szpektor Xi Chen Nan Ding Radu Soricut 32 70 0 04 May 2022
Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code Daniel Deutsch Dan Roth AI4CE 45 2 0 29 Apr 2022
Faithful to the Document or to the World? Mitigating Hallucinations via Entity-linked Knowledge in Abstractive Summarization Yue Dong John Wieting Pat Verga HILM 24 24 0 28 Apr 2022
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models? Nouha Dziri Sivan Milton Mo Yu Osmar Zaiane Siva Reddy HILM 19 188 0 17 Apr 2022
Evaluating Factuality in Text Simplification Ashwin Devaraj William Sheffield Byron C. Wallace Junyi Jessy Li HILM 27 41 0 15 Apr 2022
NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias Nayeon Lee Yejin Bang Tiezheng Yu Andrea Madotto Pascale Fung 25 24 0 11 Apr 2022
Evaluation of Automatic Text Summarization using Synthetic Facts J. Ahn Foaad Khosmood HILM 18 0 0 11 Apr 2022
Probing Factually Grounded Content Transfer with Factual Ablation Peter West Chris Quirk Michel Galley Yejin Choi HILM 30 9 0 18 Mar 2022
Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search Daniel King Zejiang Shen Nishant Subramani Daniel S. Weld Iz Beltagy Doug Downey HILM 28 31 0 16 Mar 2022
Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking Jamin Shin Hangyeol Yu Hyeongdon Moon Andrea Madotto Juneyoung Park 30 29 0 03 Mar 2022
Read before Generate! Faithful Long Form Question Answering with Machine Reading Dan Su Xiaoguang Li Jindi Zhang Lifeng Shang Xin Jiang Qun Liu Pascale Fung HILM 19 59 0 01 Mar 2022
Learning Cluster Patterns for Abstractive Summarization Sung-Guk Jo Jeong-Jae Kim Byung-Won On 21 3 0 22 Feb 2022
Survey of Hallucination in Natural Language Generation Ziwei Ji Nayeon Lee Rita Frieske Tiezheng Yu D. Su ... Delong Chen Wenliang Dai Ho Shu Chan Andrea Madotto Pascale Fung HILM LRM 67 2,243 0 08 Feb 2022
DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence Wei-Ye Zhao Michael Strube Steffen Eger 27 37 0 26 Jan 2022
Measuring Attribution in Natural Language Generation Models Hannah Rashkin Vitaly Nikolaev Matthew Lamm Lora Aroyo Michael Collins Dipanjan Das Slav Petrov Gaurav Singh Tomar Iulia Turc David Reitter 39 173 0 23 Dec 2021
QuALITY: Question Answering with Long Input Texts, Yes! Richard Yuanzhe Pang Alicia Parrish Nitish Joshi Nikita Nangia Jason Phang ... Vishakh Padmakumar Johnny Ma Jana Thompson He He Sam Bowman RALM 30 141 0 16 Dec 2021
Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures Eugene Bagdasaryan Vitaly Shmatikov SILM AAML 27 78 0 09 Dec 2021
Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices Hariom A. Pandya Brijesh S. Bhatt 40 27 0 07 Dec 2021
TODSum: Task-Oriented Dialogue Summarization with State Tracking Lulu Zhao Fujia Zheng Keqing He Weihao Zeng Yuejie Lei Huixing Jiang Wei Wu Weiran Xu Jun Guo Fanyu Meng 42 23 0 25 Oct 2021
Explainable Fact-checking through Question Answering Jing Yang D. Vega-Oliveros Taís Seibt Anderson de Rezende Rocha HILM 27 14 0 11 Oct 2021
Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries Xiangru Tang Alexander R. Fabbri Haoran Li Ziming Mao Griffin Adams Borui Wang Asli Celikyilmaz Yashar Mehdad Dragomir R. Radev HILM 13 19 0 19 Sep 2021
Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation Mingkai Deng Bowen Tan Zhengzhong Liu Eric P. Xing Zhiting Hu 16 72 0 14 Sep 2021
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization Mengyao Cao Yue Dong Jackie C.K. Cheung HILM 178 146 0 30 Aug 2021
Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation Yuexiang Xie Fei Sun Yang Deng Yaliang Li Bolin Ding HILM 26 53 0 30 Aug 2021
EmailSum: Abstractive Email Thread Summarization Shiyue Zhang Asli Celikyilmaz Jianfeng Gao Joey Tianyi Zhou 27 38 0 30 Jul 2021
To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text Matthew Wilber William Timkey Marten van Schijndel 21 8 0 03 Jun 2021
Focus Attention: Promoting Faithfulness and Diversity in Summarization Rahul Aralikatte Shashi Narayan Joshua Maynez S. Rothe Ryan T. McDonald 35 45 0 25 May 2021
Towards Human-Free Automatic Quality Evaluation of German Summarization Neslihan Iskender Oleg V. Vasilyev Tim Polzehl John Bohannon Sebastian Möller 29 1 0 13 May 2021
Improving Factual Consistency of Abstractive Summarization via Question Answering Feng Nan Cicero Nogueira dos Santos Henghui Zhu Patrick K. L. Ng Kathleen McKeown Ramesh Nallapati Dejiao Zhang Zhiguo Wang Andrew O. Arnold Bing Xiang HILM 14 82 0 10 May 2021
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark Nouha Dziri Hannah Rashkin Tal Linzen David Reitter ALM 195 79 0 30 Apr 2021
The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey Yi-Chong Huang Xiachong Feng Xiaocheng Feng Bing Qin HILM 136 105 0 30 Apr 2021
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics Artidoro Pagnoni Vidhisha Balachandran Yulia Tsvetkov HILM 231 306 0 27 Apr 2021
Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection Sihao Chen Fan Zhang Kazoo Sone Dan Roth HILM 47 104 0 19 Apr 2021
$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering Or Honovich Leshem Choshen Roee Aharoni Ella Neeman Idan Szpektor Omri Abend HILM 33 138 0 16 Apr 2021
What's in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization Griffin Adams Emily Alsentzer Mert Ketenci Jason Zucker Noémie Elhadad 50 47 0 12 Apr 2021
Annotating and Modeling Fine-grained Factuality in Summarization Tanya Goyal Greg Durrett HILM 18 153 0 09 Apr 2021
A New Approach to Overgenerating and Scoring Abstractive Summaries Kaiqiang Song Bingqing Wang Z. Feng Fei Liu 22 17 0 05 Apr 2021
QuestEval: Summarization Asks for Fact-based Evaluation Thomas Scialom Paul-Alexis Dray Patrick Gallinari Sylvain Lamprier Benjamin Piwowarski Jacopo Staiano Alex Jinpeng Wang HILM 16 267 0 23 Mar 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics Sebastian Gehrmann Tosin P. Adewumi Karmanya Aggarwal Pawan Sasanka Ammanamanchi Aremu Anuoluwapo ... Nishant Subramani Wei-ping Xu Diyi Yang Akhila Yerukola Jiawei Zhou VLM 260 285 0 02 Feb 2021
What Makes a Good and Useful Summary? Incorporating Users in Automatic Summarization Research Maartje ter Hoeve Julia Kiseleva Maarten de Rijke 33 7 0 14 Dec 2020
Detecting Hallucinated Content in Conditional Neural Sequence Generation Chunting Zhou Graham Neubig Jiatao Gu Mona T. Diab P. Guzmán Luke Zettlemoyer Marjan Ghazvininejad HILM 39 195 0 05 Nov 2020
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation Yasuhide Miura Yuhao Zhang Emily Bao Tsai C. Langlotz Dan Jurafsky MedIm 157 156 0 20 Oct 2020