The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations

8 October 2023

S.M. Towhidul Islam Tonmoy

Papers citing "The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations"

50 / 77 papers shown

Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation Zhan Peng Lee Andre Lin Calvin Tan RALM HILM 32 0 0 16 May 2025
Atomic Consistency Preference Optimization for Long-Form Question Answering Jingfeng Chen Raghuveer Thirukovalluru Junlin Wang Kaiwei Luo Bhuwan Dhingra KELM HILM 20 0 0 14 May 2025
Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models Makoto Sato HILM LRM 33 1 0 01 May 2025
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes Raúl Vázquez Timothee Mickus Elaine Zosa Teemu Vahtola Jörg Tiedemann ... Liane Guillou Ona de Gibert Jaione Bengoetxea Joseph Attieh Marianna Apidianaki HILM VLM LRM 90 0 0 16 Apr 2025
MMKB-RAG: A Multi-Modal Knowledge-Based Retrieval-Augmented Generation Framework Zihan Ling Zhiyao Guo Yixuan Huang Yi An Shuai Xiao Jinsong Lan Xiaoyong Zhu Bo Zheng RALM VLM 57 0 0 14 Apr 2025
SafeChat: A Framework for Building Trustworthy Collaborative Assistants and a Case Study of its Usefulness Biplav Srivastava Kausik Lakkaraju Nitin Gupta Vansh Nagpal Bharath Muppasani Sara E. Jones 16 0 0 08 Apr 2025
Understanding the Effects of RLHF on the Quality and Detectability of LLM-Generated Texts Beining Xu Arkaitz Zubiaga DeLMO 73 0 0 23 Mar 2025
Exploring the Reliability of Self-explanation and its Relationship with Classification in Language Model-driven Financial Analysis Han Yuan Li Zhang Zheng Ma 163 0 0 20 Mar 2025
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration David Wan Justin Chih-Yao Chen Elias Stengel-Eskin Joey Tianyi Zhou LLMAG LRM 65 1 0 19 Mar 2025
Do Multimodal Large Language Models Understand Welding? Grigorii Khvatskii Yong Suk Lee Corey Angst Maria Gibbs Robert Landers Nitesh V. Chawla AI4CE 49 1 0 18 Mar 2025
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis Guy Bar-Shalom Fabrizio Frasca Derek Lim Yoav Gelberg Yftah Ziser Ran El-Yaniv Gal Chechik Haggai Maron 67 0 0 18 Mar 2025
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence Sophia Hager David Mueller Kevin Duh Nicholas Andrews 67 0 0 18 Mar 2025
ProAPO: Progressively Automatic Prompt Optimization for Visual Classification Xiangyan Qu Gaopeng Gou Jiamin Zhuang Jing Yu Kun Song Qihao Wang Yili Li Gang Xiong VLM 93 0 0 13 Mar 2025
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models Jiamin Su Yibo Yan Fangteng Fu Hao Zhang Jingheng Ye Xiang Liu Jiahao Huo Huiyu Zhou Xuming Hu ELM 57 0 0 17 Feb 2025
Risk-Aware Distributional Intervention Policies for Language Models Bao Nguyen Binh Nguyen Duy Nguyen V. Nguyen 30 1 0 28 Jan 2025
Emerging Security Challenges of Large Language Models Herve Debar Sven Dietrich Pavel Laskov Emil C. Lupu Eirini Ntoutsi ELM 29 0 0 23 Dec 2024
Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report Markus Dablander 80 0 0 18 Dec 2024
Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment Zhen Zhang Xinyu Wang Yong Jiang Zhuo Chen Feiteng Mu Mengting Hu Pengjun Xie Fei Huang KELM 59 2 0 09 Nov 2024
The Potential of LLMs in Medical Education: Generating Questions and Answers for Qualification Exams Yunqi Zhu Wen Tang Ying Sun Xuebing Yang Liyang Dou Yifan Gu Yuanyuan Wu Wensheng Zhang Ying Sun Xuebing Yang LM&MA ELM 46 1 0 31 Oct 2024
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications Monica Riedler Stefan Langer VLM 41 12 0 29 Oct 2024
Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy Benedict Aaron Tjandra Muhammed Razzak Jannik Kossen Kunal Handa Yarin Gal HILM 33 0 0 22 Oct 2024
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination Jerry Huang Prasanna Parthasarathi Mehdi Rezagholizadeh Boxing Chen Sarath Chandar 53 0 0 22 Oct 2024
LLMScan: Causal Scan for LLM Misbehavior Detection Mengdi Zhang Kai Kiat Goh Peixin Zhang Jun Sun Rose Lin Xin Hongyu Zhang 23 0 0 22 Oct 2024
ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs Reza Fayyazi Stella Hoyos Trueba Michael Zuzak S. Yang 38 0 0 22 Oct 2024
Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation Ted Kwartler Matthew Berman Alan Aqrawi LLMAG 26 3 0 18 Oct 2024
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization Catarina G. Belem Pouya Pezeshkpour Hayate Iso Seiji Maekawa Nikita Bhutani Estevam R. Hruschka HILM 73 1 0 17 Oct 2024
Evaluation of Attribution Bias in Retrieval-Augmented Large Language Models Amin Abolghasemi Leif Azzopardi Seyyed Hadi Hashemi Maarten de Rijke Suzan Verberne 39 0 0 16 Oct 2024
'Quis custodiet ipsos custodes?' Who will watch the watchmen? On Detecting AI-generated peer-reviews Sandeep Kumar Mohit Sahu Vardhan Gacche Tirthankar Ghosal Asif Ekbal DeLMO 32 2 0 13 Oct 2024
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide Dohun Lee Bryan S Kim Geon Yeong Park Jong Chul Ye VGen 36 1 0 06 Oct 2024
Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations Sagi Shaier Ari Kobren Philip Ogren HILM 31 6 0 05 Oct 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Hadas Orgad Michael Toker Zorik Gekhman Roi Reichart Idan Szpektor Hadas Kotek Yonatan Belinkov HILM AIFin 61 25 0 03 Oct 2024
FactAlign: Long-form Factuality Alignment of Large Language Models Chao-Wei Huang Yun-Nung Chen HILM 27 2 0 02 Oct 2024
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Marco Gaido Sara Papi L. Bentivogli A. Brutti Mauro Cettolo R. Gretter M. Matassoni Mohamed Nabih Matteo Negri 39 0 0 01 Oct 2024
ProSLM : A Prolog Synergized Language Model for explainable Domain Specific Knowledge Based Question Answering Priyesh Vakharia Abigail Kufeldt Max Meyers Ian Lane Leilani H. Gilpin 34 0 0 17 Sep 2024
Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games Juhwan Choi Youngbin Kim 46 0 0 10 Sep 2024
Towards Reliable Medical Question Answering: Techniques and Challenges in Mitigating Hallucinations in Language Models Duy Khoa Pham Bao Quoc Vo LM&MA HILM 31 4 0 25 Aug 2024
Analysis of Plan-based Retrieval for Grounded Text Generation Ameya Godbole Nicholas Monath Seungyeon Kim A. S. Rawat Andrew McCallum Manzil Zaheer RALM 46 2 0 20 Aug 2024
Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin Samuel Frontull Georg Moser 28 2 0 11 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey) K. Kenthapadi M. Sameki Ankur Taly HILM ELM AILaw 39 12 0 10 Jul 2024
Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions Xiang Li Haoran Tang Siyu Chen Ziwei Wang Ryan Chen Marcin Abram LRM 31 1 0 02 Jul 2024
$\text{Memory}^3$: Language Modeling with Explicit Memory Hongkang Yang Zehao Lin Wenjin Wang Hao Wu Zhiyu Li ... Yu Yu Kai Chen Zhiyu Li Linpeng Tang Weinan E 50 11 0 01 Jul 2024
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments Zhenrui Yue Huimin Zeng Lanyu Shang Yifan Liu Yang Zhang Dong Wang RALM 43 2 0 14 Jun 2024
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy Haw-Shiuan Chang Nanyun Peng Mohit Bansal Anil Ramakrishna Tagyoung Chung HILM 42 2 0 11 Jun 2024
CRAG -- Comprehensive RAG Benchmark Xiao Yang Kai Sun Hao Xin Yushi Sun Nikita Bhalla ... Nirav Shah Rakesh Wanga Anuj Kumar Wen-tau Yih Xin Luna Dong 26 24 0 07 Jun 2024
The Battle of LLMs: A Comparative Study in Conversational QA Tasks Aryan Rangapur Aman Rangapur ELM 40 7 0 28 May 2024
Oracle-Checker Scheme for Evaluating a Generative Large Language Model Y. Zeng Li-C. Wang Thomas Ibbetson 43 0 0 06 May 2024
Building a Large Japanese Web Corpus for Large Language Models Naoaki Okazaki Kakeru Hattori Hirai Shota Hiroki Iida Masanari Ohi Kazuki Fujii Taishi Nakamura Mengsay Loem Rio Yokota Sakae Mizuki 55 6 0 27 Apr 2024
Evaluating Consistency and Reasoning Capabilities of Large Language Models Yash Saxena Sarthak Chopra Arunendra Mani Tripathi ELM LRM 35 5 0 25 Apr 2024
Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations Mahjabin Nahar Haeseung Seo Eun-Ju Lee Aiping Xiong Dongwon Lee HILM 37 11 0 04 Apr 2024
FACTOID: FACtual enTailment fOr hallucInation Detection Vipula Rawte S. M. Towhidul Krishnav Rajbangshi Shravani Nag Aman Chadha Amit P. Sheth Amitava Das HILM 45 3 0 28 Mar 2024