Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM, LRM
arXiv: 2404.18930

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 261 papers shown
TARAC: Mitigating Hallucination in LVLMs via Temporal Attention Real-time Accumulative Connection
C. Xie
Tongxuan Liu
Lei Jiang
Yuting Zeng
Jinpei Guo
Yunheng Shen
Weizhe Huang
Jing Li
Xiaohua Xu
VLM
82
0
0
05 Apr 2025
Towards Trustworthy GUI Agents: A Survey
Yucheng Shi
Wenhao Yu
Wenlin Yao
Wenhu Chen
Ninghao Liu
96
6
0
30 Mar 2025
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
Hongcheng Gao
Jiashu Qu
Jingyi Tang
Baolong Bi
Yi Liu
Hongyu Chen
Li Liang
Li Su
Qingming Huang
MLLM, VLM, LRM
159
6
0
25 Mar 2025
Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
Shuo Li
Jiajun Sun
Guodong Zheng
Xiaoran Fan
Yujiong Shen
...
Wenming Tan
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
AAML, VLM
195
1
0
19 Mar 2025
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence
Sophia Hager
David Mueller
Kevin Duh
Nicholas Andrews
154
1
0
18 Mar 2025
Do Multimodal Large Language Models Understand Welding?
Grigorii Khvatskii
Yong Suk Lee
Corey Angst
Maria Gibbs
Robert Landers
Nitesh Chawla
AI4CE
83
1
0
18 Mar 2025
Can Large Vision Language Models Read Maps Like a Human?
Shuo Xing
Zezhou Sun
Shuangyu Xie
Kaiyuan Chen
Yanjia Huang
Yuping Wang
Jiachen Li
Dezhen Song
Zhengzhong Tu
145
8
0
18 Mar 2025
RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration
Hong Qing Yu
Frank McQuade
107
3
0
14 Mar 2025
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
Bhavik Chandna
Mariam Aboujenane
Usman Naseem
102
0
0
13 Mar 2025
Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification
Nathaniel Lesperance
S. Ratnasingham
Graham W. Taylor
VLM
128
0
0
13 Mar 2025
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
Beitao Chen
Xinyu Lyu
Lianli Gao
Jingkuan Song
Jikang Cheng
195
1
0
11 Mar 2025
Hallucinatory Image Tokens: A Training-free EAZY Approach on Detecting and Mitigating Object Hallucinations in LVLMs
Liwei Che
Tony Qingze Liu
Jing Jia
Weiyi Qin
Ruixiang Tang
Vladimir Pavlovic
MLLM, VLM
204
2
0
10 Mar 2025
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Cong Chen
Mingyu Liu
Chenchen Jing
Y. Zhou
Fengyun Rao
Hao Chen
Bo Zhang
Chunhua Shen
MLLM, AAML, VLM
129
5
0
09 Mar 2025
TPC: Cross-Temporal Prediction Connection for Vision-Language Model Hallucination Reduction
Chao Wang
Weiwei Fu
Yang Zhou
MLLM, VLM
141
0
0
06 Mar 2025
MCiteBench: A Multimodal Benchmark for Generating Text with Citations
Caiyu Hu
Yikai Zhang
Tinghui Zhu
Yiwei Ye
Yanghua Xiao
192
0
0
04 Mar 2025
Evaluating and Predicting Distorted Human Body Parts for Generated Images
Lu Ma
Kaibo Cao
Hao Liang
Jiaxin Lin
Zhiyu Li
Yuhong Liu
Jihong Zhang
Wentao Zhang
Tengjiao Wang
MedIm
97
0
0
02 Mar 2025
HalCECE: A Framework for Explainable Hallucination Detection through Conceptual Counterfactuals in Image Captioning
Maria Lymperaiou
Giorgos Filandrianos
Angeliki Dimitriou
Athanasios Voulodimos
Giorgos Stamou
MLLM
56
0
0
01 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo
Lijun Zhang
Mengyang Sun
Lin Yuanbo Wu
Peng Wang
Yize Zhang
MLLM, VLM
108
3
0
01 Mar 2025
Towards Statistical Factuality Guarantee for Large Vision-Language Models
Zechao Li
Chao Yan
Nicholas J. Jackson
Wendi Cui
B. Li
Jiaxin Zhang
Bradley Malin
143
0
0
27 Feb 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
130
1
0
25 Feb 2025
Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation
Trevine Oorloff
Yaser Yacoob
Abhinav Shrivastava
72
0
0
24 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
165
9
0
21 Feb 2025
Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding
Kyungmin Min
Minbeom Kim
Kang-il Lee
Dongryeol Lee
Kyomin Jung
MLLM
184
7
0
20 Feb 2025
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Yue Yang
Ajay Patel
Matt Deitke
Tanmay Gupta
Luca Weihs
...
Mark Yatskar
Chris Callison-Burch
Ranjay Krishna
Aniruddha Kembhavi
Christopher Clark
SyDa
207
3
0
20 Feb 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing
Yuping Wang
Peiran Li
Ruizheng Bai
Yansen Wang
Chan-wei Hu
Chengxuan Qian
Huaxiu Yao
Zhengzhong Tu
187
8
0
18 Feb 2025
Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
Junda Wu
Yuxin Xiong
Xintong Li
Yu Xia
Ruoyu Wang
...
Sungchul Kim
Ryan Rossi
Lina Yao
Jingbo Shang
Julian McAuley
CLL, VLM
136
0
0
17 Feb 2025
Valuable Hallucinations: Realizable Non-realistic Propositions
Qiucheng Chen
Bo Wang
LRM
138
0
0
16 Feb 2025
MRAMG-Bench: A Comprehensive Benchmark for Advancing Multimodal Retrieval-Augmented Multimodal Generation
Qinhan Yu
Zhiyou Xiao
Binghui Li
Zhengren Wang
Chong Chen
Wentao Zhang
RALM, VLM
262
0
0
06 Feb 2025
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering
Zhuowei Li
Haizhou Shi
Yunhe Gao
Di Liu
Zhenting Wang
Yuxiao Chen
Ting Liu
Long Zhao
Hao Wang
Dimitris N. Metaxas
MLLM
90
3
0
05 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
223
17
0
05 Feb 2025
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik
Fahad Shamshad
Muzammal Naseer
Karthik Nandakumar
Fahad Shahbaz Khan
Salman Khan
AAML, MLLM, VLM
137
1
0
03 Feb 2025
Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models
Mingi Jung
Saehuyng Lee
Eunji Kim
Sungroh Yoon
565
2
0
03 Feb 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen
Zhiyu Wu
Xingchao Liu
Zizheng Pan
Wen Liu
Zhenda Xie
X. Yu
Chong Ruan
AI4TS
175
160
0
29 Jan 2025
Learning to Summarize from LLM-generated Feedback
Hwanjun Song
Taewon Yun
Yuho Lee
Jihwan Oh
Gihun Lee
Jason (Jinglun) Cai
Hang Su
225
10
0
28 Jan 2025
Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink
Yining Wang
Mi Zhang
Junjie Sun
Chenyue Wang
Min Yang
Hui Xue
Jialing Tao
Ranjie Duan
Qingbin Liu
65
2
0
28 Jan 2025
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
Andrés Villa
Juan Carlos León Alcázar
Motasem Alfarra
Vladimir Araujo
Alvaro Soto
Bernard Ghanem
VLM
46
2
0
06 Jan 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
Jing Liu
N. Shah
Ping Chen
157
6
0
18 Dec 2024
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
Ido Cohen
Daniela Gottesman
Mor Geva
Raja Giryes
VLM
186
1
1
18 Dec 2024
Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence
Jinghan He
Kuan Zhu
Haiyun Guo
Sihang Li
Zhenglin Hua
Yuheng Jia
Ming Tang
Tat-Seng Chua
Jinqiao Wang
VLM
144
5
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Le Yang
Ziwei Zheng
Boxu Chen
Zhengyu Zhao
Chenhao Lin
Chao Shen
VLM
317
7
0
18 Dec 2024
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
Shengqiong Wu
Hao Fei
Liangming Pan
William Yang Wang
Shuicheng Yan
Tat-Seng Chua
LRM
167
1
0
15 Dec 2024
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Z. F. Wu
Xiaokang Chen
Zizheng Pan
Xianglong Liu
Wen Liu
...
Xingkai Yu
Haowei Zhang
Liang Zhao
Yijiao Wang
Chong Ruan
MLLM, VLM, MoE
199
159
0
13 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM, LRM
153
19
0
07 Dec 2024
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis
Po-Hsuan Huang
Jeng-Lin Li
Chin-Po Chen
Ming-Ching Chang
Wei-Chao Chen
LRM
142
1
0
04 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
170
22
0
03 Dec 2024
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Alice Heiman
Xiaoman Zhang
E. Chen
Sung Eun Kim
Pranav Rajpurkar
HILM, MedIm
159
0
0
27 Nov 2024
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He
Zeren Chen
Zhelun Shi
Tianyu Yu
Jing Shao
Lu Sheng
MLLM
227
2
0
26 Nov 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Jiaqi Wang
Yifei Gao
Jitao Sang
MLLM
224
2
0
24 Nov 2024
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
Junzhe Chen
Tianshu Zhang
Shijie Huang
Yuwei Niu
Linfeng Zhang
Lijie Wen
Xuming Hu
MLLM, VLM
505
6
0
22 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM, VGen
230
3
0
22 Nov 2024