ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.16527
  4. Cited By
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
v1v2 (latest)

Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art

25 March 2024
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
    LRM
ArXiv (abs)PDFHTML

Papers citing "Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art"

50 / 116 papers shown
Title
Search-Based Software Engineering in the Landscape of AI Foundation Models
Search-Based Software Engineering in the Landscape of AI Foundation Models
Hassan Sartaj
Shaukat Ali
22
0
0
26 May 2025
Adaptive Stress Testing Black-Box LLM Planners
Adaptive Stress Testing Black-Box LLM Planners
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
76
0
0
08 May 2025
RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics
RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics
Zhiyuan Zhang
Yuxin He
Yong Sun
Junyu Shi
Lijiang Liu
Qiang Nie
VLM
89
0
0
02 Apr 2025
Fine-Tuning Large Language Models to Appropriately Abstain with Semantic
  Entropy
Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy
Benedict Aaron Tjandra
Muhammed Razzak
Jannik Kossen
Kunal Handa
Yarin Gal
HILM
56
1
0
22 Oct 2024
Addressing Image Hallucination in Text-to-Image Generation through
  Factual Image Retrieval
Addressing Image Hallucination in Text-to-Image Generation through Factual Image Retrieval
Youngsun Lim
Hyunjung Shim
DiffMHILMMQ
48
4
0
15 Jul 2024
Hallucination Detection: Robustly Discerning Reliable Answers in Large
  Language Models
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
Yuyan Chen
Qiang Fu
Yichen Yuan
Zhihao Wen
Ge Fan
Dayiheng Liu
Dongmei Zhang
Zhixu Li
Yanghua Xiao
HILM
56
76
0
04 Jul 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Sijin Chen
Xin Chen
Anqi Pang
Xianfang Zeng
Wei Cheng
...
C. Zhang
Jingyi Yu
Gang Yu
Bin-Bin Fu
Tao Chen
AI4CE
90
42
0
31 May 2024
Sora Detector: A Unified Hallucination Detection for Large Text-to-Video
  Models
Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models
Zhixuan Chu
Lei Zhang
Yichen Sun
Siqiao Xue
Peng Kuang
Zhan Qin
Kui Ren
HILMEGVM
44
14
0
07 May 2024
A Moral Imperative: The Need for Continual Superalignment of Large
  Language Models
A Moral Imperative: The Need for Continual Superalignment of Large Language Models
Gokul Puthumanaillam
Manav Vora
Pranay Thangeda
Melkior Ornik
76
7
0
13 Mar 2024
Understanding the planning of LLM agents: A survey
Understanding the planning of LLM agents: A survey
Xu Huang
Weiwen Liu
Xiaolong Chen
Xingmei Wang
Hao Wang
Defu Lian
Yasheng Wang
Ruiming Tang
Enhong Chen
LLMAGLM&Ro
104
167
0
05 Feb 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models
  via Over-Trust Penalty and Retrospection-Allocation
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
120
197
0
29 Nov 2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware
  Direct Preference Optimization
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLMVLM
103
127
0
28 Nov 2023
Multimodal Large Language Models: A Survey
Multimodal Large Language Models: A Survey
Jiayang Wu
Wensheng Gan
Zefeng Chen
Shicheng Wan
Philip S. Yu
80
187
0
22 Nov 2023
A Survey on Multimodal Large Language Models for Autonomous Driving
A Survey on Multimodal Large Language Models for Autonomous Driving
Can Cui
Yunsheng Ma
Xu Cao
Wenqian Ye
Yang Zhou
...
Xinrui Yan
Shuqi Mei
Jianguo Cao
Ziran Wang
Chao Zheng
106
283
0
21 Nov 2023
Deploying and Evaluating LLMs to Program Service Mobile Robots
Deploying and Evaluating LLMs to Program Service Mobile Robots
Zichao Hu
Francesca Lucchetti
Claire Schlesinger
Yash Saxena
Anders Freeman
Sadanand Modak
Arjun Guha
Joydeep Biswas
65
40
0
18 Nov 2023
A Language Agent for Autonomous Driving
A Language Agent for Autonomous Driving
Jiageng Mao
Junjie Ye
Yuxi Qian
Marco Pavone
Yue Wang
LM&RoLRM
68
102
0
17 Nov 2023
Large Language Models for Robotics: A Survey
Large Language Models for Robotics: A Survey
Fanlong Zeng
Wensheng Gan
Yongheng Wang
Ning Liu
Philip S. Yu
LM&Ro
165
136
0
13 Nov 2023
On the Road with GPT-4V(ision): Early Explorations of Visual-Language
  Model on Autonomous Driving
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
Licheng Wen
Xuemeng Yang
Daocheng Fu
Xiaofeng Wang
Pinlong Cai
...
Xinyu Cai
Min Dou
Shuanglu Hu
Botian Shi
Yu Qiao
VLM
84
84
0
09 Nov 2023
LLM4Drive: A Survey of Large Language Models for Autonomous Driving
LLM4Drive: A Survey of Large Language Models for Autonomous Driving
Zhenjie Yang
Xiaosong Jia
Hongyang Li
Junchi Yan
ELM
106
115
0
02 Nov 2023
LUNA: A Model-Based Universal Analysis Framework for Large Language
  Models
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
Da Song
Xuan Xie
Jiayang Song
Derui Zhu
Yuheng Huang
Felix Juefei Xu
Lei Ma
ALM
64
5
0
22 Oct 2023
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large
  Language Models via Transferable Adversarial Attacks
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks
Xiaodong Yu
Hao Cheng
Xiaodong Liu
Dan Roth
Jianfeng Gao
HILMAAML
48
16
0
19 Oct 2023
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable
  Autonomous Driving
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving
Long Chen
Oleg Sinavski
Jan Hünermann
Alice Karnsund
Andrew James Willmott
Danny Birch
Daniel Maund
Jamie Shotton
MLLM
92
204
0
03 Oct 2023
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large
  Language Model
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
Zhenhua Xu
Yujia Zhang
Enze Xie
Zhen Zhao
Yong Guo
Kwan-Yee. K. Wong
Zhenguo Li
Hengshuang Zhao
MLLM
71
292
0
02 Oct 2023
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial
  Examples
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples
Jia-Yu Yao
Kun-Peng Ning
Zhen-Hui Liu
Munan Ning
Li Yuan
HILMLRMAAML
66
190
0
02 Oct 2023
An Attentional Recurrent Neural Network for Occlusion-Aware Proactive
  Anomaly Detection in Field Robot Navigation
An Attentional Recurrent Neural Network for Occlusion-Aware Proactive Anomaly Detection in Field Robot Navigation
Jihun Han
Tianchen Ji
Yoonsang Lee
Katherine Driggs-Campbell
59
3
0
28 Sep 2023
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking
  Unrelated Questions
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
Lorenzo Pacchiardi
A. J. Chan
Sören Mindermann
Ilan Moscovitz
Alexa Y. Pan
Y. Gal
Owain Evans
J. Brauner
LLMAGHILM
67
52
0
26 Sep 2023
Can LLM-Generated Misinformation Be Detected?
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
103
177
0
25 Sep 2023
A Survey of Hallucination in Large Foundation Models
A Survey of Hallucination in Large Foundation Models
Vipula Rawte
A. Sheth
Amitava Das
HILMLRM
184
378
0
12 Sep 2023
Can you text what is happening? Integrating pre-trained language
  encoders into trajectory prediction models for autonomous driving
Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving
Ali Keysan
Andreas Look
Eitan Kosman
Gonca Gürsun
Jörg Wagner
Yu Yao
Barbara Rakitsch
80
31
0
11 Sep 2023
SayCanPay: Heuristic Planning with Large Language Models using Learnable
  Domain Knowledge
SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
Rishi Hazra
Pedro Zuidberg Dos Martires
Luc de Raedt
LM&RoLLMAG
46
35
0
24 Aug 2023
FLIRT: Feedback Loop In-context Red Teaming
FLIRT: Feedback Loop In-context Red Teaming
Ninareh Mehrabi
Palash Goyal
Christophe Dupuy
Qian Hu
Shalini Ghosh
R. Zemel
Kai-Wei Chang
Aram Galstyan
Rahul Gupta
DiffM
52
64
0
08 Aug 2023
MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving
  at Unsignalized Intersections
MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections
Jiaqi Liu
Peng Hang
Xiao Qi
Jianqiang Wang
Jian Sun
61
46
0
30 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
299
11,894
0
18 Jul 2023
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output
  Robustness of Large Language Models
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models
Huachuan Qiu
Shuai Zhang
Anqi Li
Hongliang He
Zhenzhong Lan
ALM
63
50
0
17 Jul 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of
  LLMs by Validating Low-Confidence Generation
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney
Wenlin Yao
Hongming Zhang
Jianshu Chen
Dong Yu
HILM
109
170
0
08 Jul 2023
Robots That Ask For Help: Uncertainty Alignment for Large Language Model
  Planners
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
Allen Z. Ren
Anushri Dixit
Alexandra Bodrova
Sumeet Singh
Stephen Tu
...
Jacob Varley
Zhenjia Xu
Dorsa Sadigh
Andy Zeng
Anirudha Majumdar
LM&Ro
254
229
0
04 Jul 2023
Can LLMs Express Their Uncertainty? An Empirical Evaluation of
  Confidence Elicitation in LLMs
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong
Zhiyuan Hu
Xinyang Lu
Yifei Li
Jie Fu
Junxian He
Bryan Hooi
198
439
0
22 Jun 2023
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT
  Models
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Wei Ping
Weixin Chen
Hengzhi Pei
Chulin Xie
Mintong Kang
...
Zinan Lin
Yuk-Kit Cheng
Sanmi Koyejo
D. Song
Yue Liu
95
416
0
20 Jun 2023
CLARA: Classifying and Disambiguating User Commands for Reliable
  Interactive Robotic Agents
CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents
Jeongeun Park
Seungwon Lim
Joonhyung Lee
Sangbeom Park
Minsuk Chang
Youngjae Yu
Sungjoon Choi
LM&Ro
69
24
0
17 Jun 2023
Conformal Language Modeling
Conformal Language Modeling
Victor Quach
Adam Fisch
Tal Schuster
Adam Yala
J. Sohn
Tommi Jaakkola
Regina Barzilay
220
65
0
16 Jun 2023
Conformal Prediction with Large Language Models for Multi-Choice
  Question Answering
Conformal Prediction with Large Language Models for Multi-Choice Question Answering
Bhawesh Kumar
Cha-Chen Lu
Gauri Gupta
Anil Palepu
David R. Bellamy
Ramesh Raskar
Andrew L. Beam
81
75
0
28 May 2023
AlignScore: Evaluating Factual Consistency with a Unified Alignment
  Function
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
73
203
0
26 May 2023
PURR: Efficiently Editing Language Model Hallucinations by Denoising
  Language Model Corruptions
PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions
Anthony Chen
Panupong Pasupat
Sameer Singh
Hongrae Lee
Kelvin Guu
92
44
0
24 May 2023
NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for
  Autonomous Driving Scenario
NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario
Tianwen Qian
Jingjing Chen
Linhai Zhuo
Yang Jiao
Yueping Jiang
66
152
0
24 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent
  Debate
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAGLRM
152
718
0
23 May 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long
  Form Text Generation
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILMALM
128
684
0
23 May 2023
How Language Model Hallucinations Can Snowball
How Language Model Hallucinations Can Snowball
Muru Zhang
Ofir Press
William Merrill
Alisa Liu
Noah A. Smith
HILMLRM
123
275
0
22 May 2023
Chain-of-Knowledge: Grounding Large Language Models via Dynamic
  Knowledge Adapting over Heterogeneous Sources
Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
Xingxuan Li
Ruochen Zhao
Yew Ken Chia
Bosheng Ding
Shafiq Joty
Soujanya Poria
Lidong Bing
HILMBDLLRM
116
102
0
22 May 2023
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large
  Language Models
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Junyi Li
Xiaoxue Cheng
Wayne Xin Zhao
J. Nie
Ji-Rong Wen
HILMVLM
73
246
0
19 May 2023
QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set
  Operations
QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations
Chaitanya Malaviya
Peter Shaw
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
108
16
0
19 May 2023
123
Next