LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples
arXiv: 2310.01469 · 2 October 2023
Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Munan Ning, Li Yuan
Tags: HILM, LRM, AAML
Papers citing "LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples" (28 of 28 papers shown)
1. "RLAP: A Reinforcement Learning Enhanced Adaptive Planning Framework for Multi-step NLP Task Solving" (17 May 2025). Zepeng Ding, Dixuan Wang, Ziqin Luo, Guochao Jiang, Deqing Yang, Jiaqing Liang. Metrics: 2 / 0 / 0.
2. "Adaptive Stress Testing Black-Box LLM Planners" (8 May 2025). Neeloy Chakraborty, John Pohovey, Melkior Ornik, Katherine Driggs-Campbell. Metrics: 28 / 0 / 0.
3. "An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination" (28 Apr 2025). Dixiao Wei, Peng Yi, Jinlong Lei, Yiguang Hong, Yuchuan Du. Metrics: 176 / 0 / 0.
4. "Deep Learning-based Intrusion Detection Systems: A Survey" (10 Apr 2025). Zhiwei Xu, Yujuan Wu, Shiheng Wang, Jiabao Gao, Tian Qiu, Ziqi Wang, Hai Wan, Xibin Zhao. Metrics: 26 / 1 / 0.
5. "OAEI-LLM-T: A TBox Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching" (25 Mar 2025). Zhangcheng Qiang, Kerry Taylor, Weiqing Wang, Jing Jiang. Metrics: 57 / 0 / 0.
6. "RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration" (14 Mar 2025). Hong Qing Yu, Frank McQuade. Metrics: 48 / 1 / 0.
7. "From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper" (10 Mar 2025). Sargam Yadav, Asifa Mehmood Qureshi, Abhishek Kaushik, Shubham Sharma, Roisin Loughran, ..., Nikhil Singh, Padraic O'Hara, Pranay Jaiswal, Roshan Chandru, David Lillis. Metrics: 56 / 1 / 0.
8. "InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference" (6 Mar 2025). Tianyu Cui, Song-Jun Xu, Artem Moskalev, Shuwei Li, Tommaso Mansi, Mangal Prakash, Rui Liao. Tags: BDL. Metrics: 73 / 0 / 0.
9. "PiCO: Peer Review in LLMs based on the Consistency Optimization" (24 Feb 2025). Kun-Peng Ning, Shuo Yang, Yu-Yang Liu, Jia-Yu Yao, Zhen-Hui Liu, Yu Wang, Ming Pang, Li Yuan. Tags: ALM. Metrics: 71 / 8 / 0.
10. "Do LLMs Consider Security? An Empirical Study on Responses to Programming Questions" (20 Feb 2025). Amirali Sajadi, Binh Le, A. Nguyen, Kostadin Damevski, Preetha Chatterjee. Metrics: 63 / 2 / 0.
11. "Hallucination Detection in Large Language Models with Metamorphic Relations" (20 Feb 2025). Borui Yang, Md Afif Al Mamun, Jie M. Zhang, Gias Uddin. Tags: HILM. Metrics: 66 / 0 / 0.
12. "Unleashing the Power of Large Language Model for Denoising Recommendation" (13 Feb 2025). Shuyao Wang, Zhi Zheng, Yongduo Sui, Hui Xiong. Metrics: 111 / 0 / 0.
13. "Personalizing Education through an Adaptive LMS with Integrated LLMs" (24 Jan 2025). Kyle Spriggs, Meng Cheng Lau, Kalpdrum Passi. Tags: AI4Ed. Metrics: 57 / 0 / 0.
14. "Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense" (5 Jan 2025). Yang Ouyang, Hengrui Gu, Shuhang Lin, Wenyue Hua, Jie Peng, B. Kailkhura, Tianlong Chen, Kaixiong Zhou. Tags: AAML. Metrics: 31 / 1 / 0.
15. "Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models" (2 Jan 2025). Yanwen Huang, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao. Metrics: 88 / 0 / 0.
16. "Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering" (17 Nov 2024). Zeping Yu, Sophia Ananiadou. Metrics: 178 / 0 / 0.
17. "Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation" (8 Oct 2024). Bolei He, Nuo Chen, Xinran He, Lingyong Yan, Zhenkai Wei, Jinchang Luo, Zhen-Hua Ling. Tags: RALM, LRM. Metrics: 30 / 1 / 0.
18. "Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models" (9 Aug 2024). Zikai Xie. Tags: HILM, LRM. Metrics: 61 / 5 / 0.
19. "More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play" (7 Jun 2024). Wichayaporn Wongkamjan, Feng Gu, Yanze Wang, Ulf Hermjakob, Jonathan May, Brandon M. Stewart, Jonathan K. Kummerfeld, Denis Peskoff, Jordan L. Boyd-Graber. Metrics: 53 / 3 / 0.
20. "Evaluation of Retrieval-Augmented Generation: A Survey" (13 May 2024). Hao Yu, Aoran Gan, Kai Zhang, Shiwei Tong, Qi Liu, Zhaofeng Liu. Tags: 3DV. Metrics: 62 / 83 / 0.
21. "Talking Nonsense: Probing Large Language Models' Understanding of Adversarial Gibberish Inputs" (26 Apr 2024). Valeriia Cherepanova, James Zou. Tags: AAML. Metrics: 33 / 4 / 0.
22. "Multicalibration for Confidence Scoring in LLMs" (6 Apr 2024). Gianluca Detommaso, Martín Bertrán, Riccardo Fogliato, Aaron Roth. Metrics: 47 / 12 / 0.
23. "Towards a potential paradigm shift in health data collection and analysis" (1 Apr 2024). D. J. Herzog, Nitsa J. Herzog. Metrics: 43 / 0 / 0.
24. "Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art" (25 Mar 2024). Neeloy Chakraborty, Melkior Ornik, Katherine Driggs-Campbell. Tags: LRM. Metrics: 57 / 9 / 0.
25. "Development of a Reliable and Accessible Caregiving Language Model (CaLM)" (11 Mar 2024). B. Parmanto, Bayu Aryoyudanta, Wilbert Soekinto, Agus Setiawan, Yuhan Wang, Haomin Hu, Andi Saptono, Yong K Choi. Metrics: 32 / 0 / 0.
26. "LLM Voting: Human Choices and AI Collective Decision Making" (31 Jan 2024). Joshua C. Yang, Damian Dailisan, Marcin Korecki, C. I. Hausladen, Dirk Helbing. Metrics: 34 / 17 / 0.
27. "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models" (15 Mar 2023). Potsawee Manakul, Adian Liusie, Mark Gales. Tags: HILM, LRM. Metrics: 152 / 396 / 0.
28. "Training language models to follow instructions with human feedback" (4 Mar 2022). Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe. Tags: OSLM, ALM. Metrics: 366 / 12,003 / 0.