2404.18930
Hallucination of Multimodal Large Language Models: A Survey
29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
Papers citing
"Hallucination of Multimodal Large Language Models: A Survey"
50 / 261 papers shown
Title
AutoV: Learning to Retrieve Visual Prompt for Large Vision-Language Models
Yuan Zhang
Chun-Kai Fan
Tao Huang
Ming Lu
Sicheng Yu
Junwen Pan
Kuan Cheng
Qi She
Shanghang Zhang
VLM
LRM
21
0
0
19 Jun 2025
HEAL: An Empirical Study on Hallucinations in Embodied Agents Driven by Large Language Models
Trishna Chakraborty
Udita Ghosh
Xiaopan Zhang
Fahim Faisal Niloy
Yue Dong
Jiachen Li
Amit K. Roy-Chowdhury
Chengyu Song
LLMAG
HILM
LRM
50
0
0
18 Jun 2025
Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
Ankan Deria
Adinath Madhavrao Dukre
Feilong Tang
Sara Atito
Sudipta Roy
Muhammad Awais
Muhammad Haris Khan
Imran Razzak
VLM
42
0
0
18 Jun 2025
ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM
Yujun Wang
Jinhe Bi
Yunpu Ma
Soeren Pirk
MLLM
51
0
0
17 Jun 2025
From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models
Xinyang Li
Siqi Liu
Bochao Zou
Jiansheng Chen
Huimin Ma
27
0
0
17 Jun 2025
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
Jiachen Yu
Yufei Zhan
Ziheng Wu
Yousong Zhu
Jinqiao Wang
Minghui Qiu
VLM
LRM
22
0
0
13 Jun 2025
SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding
Woohyeon Park
Woojin Kim
Jaeik Kim
Jaeyoung Do
VLM
15
0
0
10 Jun 2025
Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models
Ruiyang Zhang
Hu Zhang
Hao Fei
Zhedong Zheng
UQCV
42
0
0
09 Jun 2025
HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains
Shijie Wang
Yilun Zhang
Zeyu Lai
Dexing Kong
30
0
0
09 Jun 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai
Yuxuan Fan
Jiantao Qiu
Fupeng Sun
Jiayi Song
Junlin Han
Zichen Liu
Conghui He
Wentao Zhang
Binhang Yuan
MLLM
VLM
28
0
0
08 Jun 2025
Ignoring Directionality Leads to Compromised Graph Neural Network Explanations
Changsheng Sun
Xinke Li
Jin Song Dong
AAML
126
0
0
05 Jun 2025
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG
Yang Tian
Fan Liu
Jingyuan Zhang
Victoria A. Webster-Wood
Yupeng Hu
Liqiang Nie
VLM
64
0
0
03 Jun 2025
V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving
Xuewen Luo
Fengze Yang
Fan Ding
Xiangbo Gao
Shuo Xing
Yang Zhou
Zhengzhong Tu
Chenxi Liu
LRM
78
1
0
03 Jun 2025
CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention
Zekai Ye
Qiming Li
Xiaocheng Feng
L. Qin
Yichong Huang
...
Zhirui Zhang
Yunfei Lu
Duyu Tang
Dandan Tu
Bing Qin
VLM
LRM
15
0
0
03 Jun 2025
Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction
Yangui Fang
Baixu Cheng
Jing Peng
Xu Li
Yu Xi
Chengwei Zhang
Guohui Zhong
45
0
0
30 May 2025
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation
Chan-wei Hu
Yueqi Wang
Shuo Xing
Chia-Ju Chen
Zhengzhong Tu
3DV
28
1
0
29 May 2025
MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration
Zhitao He
Sandeep Polisetty
Zhiyuan Fan
Yuchen Huang
Shujin Wu
Yi R.
LRM
72
2
0
29 May 2025
Preemptive Hallucination Reduction: An Input-Level Approach for Multimodal Language Model
Nokimul Hasan Arif
Shadman Rabby
Md Hefzul Hossain Papon
Sabbir Ahmed
MLLM
VLM
47
0
0
29 May 2025
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
Zijing Hu
Fengda Zhang
Kun Kuang
63
1
0
28 May 2025
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction
Yuchi Wang
Yishuo Cai
Shuhuai Ren
Sihan Yang
Linli Yao
Yuanxin Liu
Y. Zhang
Pengfei Wan
Xu Sun
VLM
64
0
0
28 May 2025
Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning
Chunyi Peng
Zhipeng Xu
Zhenghao Liu
Yishan Li
Yukun Yan
...
Zhiyuan Liu
Yu Gu
Minghe Yu
Ge Yu
Maosong Sun
LRM
95
1
0
28 May 2025
Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
Mehrdad Fazli
Bowen Wei
Ziwei Zhu
VLM
212
0
0
27 May 2025
Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models
Xinmiao Hu
C. Wang
Ruihe An
ChenYu Shao
Xiaojun Ye
Sheng Zhou
Liangcheng Li
MLLM
LRM
65
0
0
26 May 2025
The Mirage of Multimodality: Where Truth is Tested and Honesty Unravels
Jiaming Ji
Sitong Fang
Wenjing Cao
Jiahao Li
Xuyao Wang
Juntao Dai
Chi-Min Chan
Sirui Han
Yike Guo
Yaodong Yang
LRM
29
0
0
26 May 2025
ChartLens: Fine-grained Visual Attribution in Charts
Manan Suri
Puneet Mathur
Nedim Lipka
Franck Dernoncourt
Ryan Rossi
Dinesh Manocha
34
0
0
25 May 2025
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models
Yongheng Zhang
Xu Liu
Ruoxi Zhou
Qiguang Chen
Hao Fei
Wenpeng Lu
L. Qin
HILM
LRM
33
0
0
25 May 2025
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
Sicheng Feng
Song Wang
Shuyi Ouyang
Lingdong Kong
Zikai Song
Jianke Zhu
Huan Wang
Xinchao Wang
LRM
108
0
0
24 May 2025
REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
Weihan Xu
Yimeng Ma
Jingyue Huang
Yang Li
Wenye Ma
Taylor Berg-Kirkpatrick
Julian McAuley
Paul Pu Liang
Hao-Wen Dong
DiffM
VGen
182
0
0
24 May 2025
EVADE: Multimodal Benchmark for Evasive Content Detection in E-Commerce Applications
Ancheng Xu
Zhihao Yang
Junlin Li
Guanghu Yuan
Longze Chen
...
Zhen Qin
Hengyun Chang
Hamid Alinejad-Rokny
Bo Zheng
Min Yang
AAML
62
0
0
23 May 2025
Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
Jiachen Jiang
Jinxin Zhou
Bo Peng
Xia Ning
Zhihui Zhu
105
0
0
22 May 2025
OViP: Online Vision-Language Preference Learning
Shujun Liu
Siyuan Wang
Zejun Li
Jianxiang Wang
Cheng Zeng
Zhongyu Wei
MLLM
VLM
76
0
0
21 May 2025
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang
Ye Tian
Bowen Li
Xinchen Zhang
Ke Shen
Yunhai Tong
Mengdi Wang
VLM
LRM
141
6
0
21 May 2025
Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation
Xiaozhao Liu
Dinggang Shen
Xihui Liu
86
0
0
21 May 2025
Incentivizing Truthful Language Models via Peer Elicitation Games
Baiting Chen
Tong Zhu
Jiale Han
Lexin Li
Gang Li
Xiaowu Dai
124
0
0
19 May 2025
Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
Xinlong Chen
Yuanxing Zhang
Qiang Liu
Junfei Wu
Fuzheng Zhang
Tieniu Tan
MLLM
129
0
0
17 May 2025
Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models
Bidur Khanal
Sandesh Pokhrel
Sanjay Bhandari
Ramesh Rana
Nikesh Shrestha
Ram Bahadur Gurung
Cristian A. Linte
Angus Watson
Yash Raj Shrestha
Binod Bhattarai
VLM
83
0
0
11 May 2025
Adaptive Stress Testing Black-Box LLM Planners
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
82
0
0
08 May 2025
Perceiving Beyond Language Priors: Enhancing Visual Comprehension and Attention in Multimodal Models
Aarti Ghatkesar
Uddeshya Upadhyay
VLM
93
1
0
08 May 2025
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li
Xiyang Wu
Guangyao Shi
Yubin Qin
Hongyang Du
Tianyi Zhou
Dinesh Manocha
Jordan Lee Boyd-Graber
MLLM
148
1
0
02 May 2025
HyPerAlign: Interpretable Personalized LLM Alignment via Hypothesis Generation
Cristina Garbacea
Chenhao Tan
147
0
0
29 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
189
0
0
29 Apr 2025
CAMU: Context Augmentation for Meme Understanding
Girish A. Koushik
Diptesh Kanojia
Helen Treharne
Aditya Joshi
VLM
156
0
0
24 Apr 2025
Multimodal Large Language Models for Enhanced Traffic Safety: A Comprehensive Review and Future Trends
M. Tami
Mohammed Elhenawy
Huthaifa I. Ashqar
90
0
0
21 Apr 2025
Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
Chung-En
Hsuan-Chih Chen
Brian Jalaian
Nathaniel D. Bastian
AAML
VLM
80
1
0
19 Apr 2025
Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training
Xinsong Zhang
Yarong Zeng
Xinting Huang
Hu Hu
Runquan Xie
Han Hu
Zhanhui Kang
MLLM
VLM
269
2
0
17 Apr 2025
AeroLite: Tag-Guided Lightweight Generation of Aerial Image Captions
Xing Zi
Tengjun Ni
Xianjing Fan
Xian Tao
Jun Li
Ali Braytee
Mukesh Prasad
55
0
0
13 Apr 2025
Data Metabolism: An Efficient Data Design Schema For Vision Language Model
Jingyuan Zhang
Hongzhi Zhang
Zhou Haonan
Chenxi Sun
Xingguang Ji
Jiakang Wang
Fanheng Kong
Yang Liu
Qi Wang
Fuzheng Zhang
VLM
145
2
0
10 Apr 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen
Xin Yan
Bin Wen
Fan Yang
Yan Li
Di Zhang
Long Chen
MLLM
189
0
0
09 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
85
0
0
09 Apr 2025
Explaining Low Perception Model Competency with High-Competency Counterfactuals
Sara Pohland
Claire Tomlin
DiffM
AAML
97
0
0
07 Apr 2025