Evaluating the Factual Consistency of Large Language Models Through News Summarization

15 November 2022

Papers citing "Evaluating the Factual Consistency of Large Language Models Through News Summarization"

Title | Authors | Tags | Counts | Date
Consistency in Language Models: Current Landscape, Challenges, and Future Directions | Jekaterina Novikova Carol Anderson Borhane Blili-Hamelin Subhabrata Majumdar | HILM | 73 0 0 | 01 May 2025
The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs | Bryan Guan Tanya Roosta Peyman Passban Mehdi Rezagholizadeh | - | 99 0 0 | 06 Feb 2025
MEG: Medical Knowledge-Augmented Large Language Models for Question Answering | Laura Cabello Carmen Martin-Turrero Uchenna Akujuobi Anders Søgaard Carlos Bobed | AI4MH | 154 1 0 | 06 Nov 2024
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models | Kaushal Kumar Maurya KV Aditya Srivatsa Ekaterina Kochmar | - | 40 2 0 | 16 Aug 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation | Bairu Hou Yang Zhang Jacob Andreas Shiyu Chang | - | 77 5 0 | 11 Jun 2024
METAL: Towards Multilingual Meta-Evaluation | Rishav Hada Varun Gumma Mohamed Ahmed Kalika Bali Sunayana Sitaram | ELM | 43 2 0 | 02 Apr 2024
German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset | Laura Mascarell Ribin Chalumattu Annette Rios | HILM | 46 0 0 | 06 Mar 2024
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods | Hanlei Jin Yang Zhang Dan Meng Jun Wang Jinghua Tan | - | 68 80 0 | 05 Mar 2024
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing | Bryan Wang Yuliang Li Zhaoyang Lv Haijun Xia Yan Xu Raj Sodhi | - | 35 42 0 | 15 Feb 2024
Unsupervised Extractive Summarization with Learnable Length Control Strategies | Renlong Jie Xiaojun Meng Xin Jiang Qun Liu | - | 32 1 0 | 12 Dec 2023
InCA: Rethinking In-Car Conversational System Assessment Leveraging Large Language Models | Ken E. Friedl Abbas Goher Khan S. Sahoo Md. Rony Jana Germies Christian Süß | - | 32 3 0 | 13 Nov 2023
Chainpoll: A high efficacy method for LLM hallucination detection | Robert Friel Atindriyo Sanyal | LRM HILM | 34 26 0 | 22 Oct 2023
Cognitive Mirage: A Review of Hallucinations in Large Language Models | Hongbin Ye Tong Liu Aijia Zhang Wei Hua Weiqiang Jia | HILM | 48 77 0 | 13 Sep 2023
Semantic Consistency for Assuring Reliability of Large Language Models | Harsh Raj Vipul Gupta Domenic Rosati S. Majumdar | HILM | 110 14 0 | 17 Aug 2023
Editing Common Sense in Transformers | Anshita Gupta Debanjan Mondal Akshay Krishna Sheshadri Wenlong Zhao Xiang Lorraine Li Sarah Wiegreffe Niket Tandon | KELM | 47 22 0 | 24 May 2023
LM vs LM: Detecting Factual Errors via Cross Examination | Roi Cohen May Hamri Mor Geva Amir Globerson | HILM | 41 120 0 | 22 May 2023
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation | Yixin Liu Alexander R. Fabbri Pengfei Liu Yilun Zhao Linyong Nan ... Simeng Han Chenyu You Chien-Sheng Wu Caiming Xiong Dragomir R. Radev | ALM | 24 133 0 | 15 Dec 2022
Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization | Shiyue Zhang David Wan Joey Tianyi Zhou | HILM | 52 27 0 | 08 Sep 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization | Victor Sanh Albert Webson Colin Raffel Stephen H. Bach Lintang Sutawika ... T. Bers Stella Biderman Leo Gao Thomas Wolf Alexander M. Rush | LRM | 215 1,661 0 | 15 Oct 2021
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics | Artidoro Pagnoni Vidhisha Balachandran Yulia Tsvetkov | HILM | 231 306 0 | 27 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning | Brian Lester Rami Al-Rfou Noah Constant | VPVLM | 280 3,858 0 | 18 Apr 2021
GO FIGURE: A Meta Evaluation of Factuality in Summarization | Saadia Gabriel Asli Celikyilmaz Rahul Jha Yejin Choi Jianfeng Gao | HILM | 238 96 0 | 24 Oct 2020
Teaching Machines to Read and Comprehend | Karl Moritz Hermann Tomás Kociský Edward Grefenstette L. Espeholt W. Kay Mustafa Suleyman Phil Blunsom | - | 211 3,513 0 | 10 Jun 2015