2501.15269
Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink
28 January 2025
Yining Wang
Mi Zhang
Junjie Sun
Chenyue Wang
Min Yang
Hui Xue
Jialing Tao
Ranjie Duan
Qingbin Liu
Papers citing "Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink"
50 / 72 papers shown
Title
Grounded Chain-of-Thought for Multimodal Large Language Models
Qiong Wu
Xiangcong Yang
Yiyi Zhou
Chenxin Fang
Baiyang Song
Xiaoshuai Sun
Rongrong Ji
LRM
151
2
0
17 Mar 2025
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
Bo Liu
K. Zou
Liming Zhan
Zexin Lu
Xiaoyu Dong
Yidi Chen
Chengqiang Xie
Jiannong Cao
Xiao-Ming Wu
Huazhu Fu
174
2
0
25 Nov 2024
MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning
Ziliang Gan
Yu Lu
D. Zhang
Haohan Li
Che Liu
...
Haipang Wu
Chaoyou Fu
Z. Xu
Rongjunchen Zhang
Yong Dai
92
10
0
05 Nov 2024
IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing
Kang Chen
Qingheng Zhang
Chengbao Lian
Yixin Ji
Xuwei Liu
Shuguang Han
Guoqiang Wu
Fei Huang
Jufeng Chen
61
2
0
22 Oct 2024
Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate
Zheng Lin
Zhenxing Niu
Zhibin Wang
Yinghui Xu
60
6
0
30 Jul 2024
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Yangzhou Liu
Yue Cao
Zhangwei Gao
Weiyun Wang
Zhe Chen
...
Lewei Lu
Xizhou Zhu
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
92
26
0
22 Jul 2024
Multi-Object Hallucination in Vision-Language Models
Xuweiyi Chen
Ziqiao Ma
Xuejun Zhang
Sihan Xu
Shengyi Qian
Jianing Yang
David Fouhey
Joyce Chai
66
20
0
08 Jul 2024
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Zhongzhi Yu
Zheng Wang
Yonggan Fu
Huihong Shi
Khalid Shaikh
Yingyan Celine Lin
97
25
0
22 Jun 2024
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
Seungwoo Son
Wonpyo Park
Woohyun Han
Kyuyeun Kim
Jaeho Lee
MQ
59
13
0
17 Jun 2024
Yo'LLaVA: Your Personalized Language and Vision Assistant
Thao Nguyen
Haotian Liu
Yuheng Li
Mu Cai
Utkarsh Ojha
Yong Jae Lee
VLM
MLLM
88
22
0
13 Jun 2024
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks
Zonghao Ying
Aishan Liu
Xianglong Liu
Dacheng Tao
92
22
0
10 Jun 2024
Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models
Hao-Ran Cheng
Erjia Xiao
Jiahang Cao
Le Yang
Kaidi Xu
Jindong Gu
Renjing Xu
AAML
110
10
0
30 May 2024
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Image Character
Siyuan Ma
Weidi Luo
Yu Wang
Xiaogeng Liu
120
27
0
25 May 2024
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
Zhiyu Tan
Mengping Yang
Luozheng Qin
Hao Yang
Ye Qian
Qiang-feng Zhou
Cheng Zhang
Hao Li
96
6
0
21 May 2024
Adversarial Robustness for Visual Grounding of Multimodal Large Language Models
Kuofeng Gao
Yang Bai
Jiawang Bai
Yong Yang
Shu-Tao Xia
AAML
79
19
0
16 May 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
175
181
0
29 Apr 2024
Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration
Chenwei Lin
Hanjia Lyu
Jiebo Luo
Xian Xu
LM&MA
MLLM
29
2
0
15 Apr 2024
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
Yu Wang
Xiaogeng Liu
Yu-Feng Li
Muhao Chen
Chaowei Xiao
AAML
79
57
0
14 Mar 2024
Jailbreaking Attack against Multimodal Large Language Model
Zhenxing Niu
Haoxuan Ji
Xinbo Gao
Gang Hua
Rong Jin
72
70
0
04 Feb 2024
Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
Kuofeng Gao
Yang Bai
Jindong Gu
Shu-Tao Xia
Philip Torr
Zhifeng Li
Wei Liu
VLM
71
43
0
20 Jan 2024
Gemini Pro Defeated by GPT-4V: Evidence from Education
Gyeong-Geon Lee
Ehsan Latif
Lehong Shi
Xiaoming Zhai
68
24
0
27 Dec 2023
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
Xiaoqi Li
Mingxu Zhang
Yiran Geng
Haoran Geng
Yuxing Long
Yan Shen
Renrui Zhang
Jiaming Liu
Hao Dong
LM&Ro
LRM
96
94
0
24 Dec 2023
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
Chaoya Jiang
Haiyang Xu
Mengfan Dong
Jiaxing Chen
Wei Ye
Mingshi Yan
Qinghao Ye
Ji Zhang
Fei Huang
Shikun Zhang
VLM
41
57
0
12 Dec 2023
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui
Alejandro Aparcedo
Young Kyun Jang
Ser-Nam Lim
AAML
VLM
48
43
0
06 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
Tianyu Yu
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM
VLM
179
214
0
01 Dec 2023
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
120
197
0
29 Nov 2023
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Sicong Leng
Hang Zhang
Guanzheng Chen
Xin Li
Shijian Lu
Chunyan Miao
Li Bing
VLM
MLLM
141
229
0
28 Nov 2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLM
VLM
103
131
0
28 Nov 2023
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
Qifan Yu
Juncheng Li
Longhui Wei
Liang Pang
Wentao Ye
Bosheng Qin
Siliang Tang
Qi Tian
Yueting Zhuang
MLLM
VLM
93
76
0
22 Nov 2023
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
Lin Chen
Jinsong Li
Xiao-wen Dong
Pan Zhang
Conghui He
Jiaqi Wang
Feng Zhao
Dahua Lin
MLLM
VLM
188
660
0
21 Nov 2023
LION : Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
Gongwei Chen
Leyang Shen
Rui Shao
Xiang Deng
Liqiang Nie
VLM
MLLM
86
48
0
20 Nov 2023
AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation
Junyang Wang
Yuhang Wang
Guohai Xu
Jing Zhang
Yukai Gu
...
Jiaqi Wang
Haiyang Xu
Ming Yan
Ji Zhang
Jitao Sang
MLLM
VLM
64
117
0
13 Nov 2023
Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
Seongyun Lee
Sue Hyun Park
Yongrae Jo
Minjoon Seo
50
61
0
13 Nov 2023
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Zhang Li
Biao Yang
Qiang Liu
Zhiyin Ma
Shuo Zhang
Jingxu Yang
Yabo Sun
Yuliang Liu
Xiang Bai
MLLM
91
270
0
11 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
211
150
0
09 Nov 2023
DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification
Mintong Kang
Basel Alomair
Yue Liu
71
32
0
27 Oct 2023
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
Tianrui Guan
Fuxiao Liu
Xiyang Wu
Ruiqi Xian
Zongxia Li
...
Lichang Chen
Furong Huang
Yaser Yacoob
Dinesh Manocha
VLM
MLLM
94
182
0
23 Oct 2023
Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis
Chaoyi Wu
Jiayu Lei
Qiaoyu Zheng
Weike Zhao
Weixiong Lin
...
Xiao Zhou
Ziheng Zhao
Ya Zhang
Yanfeng Wang
Weidi Xie
LM&MA
138
78
0
15 Oct 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
236
465
0
14 Oct 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
Jiabo Ye
Anwen Hu
Haiyang Xu
Qinghao Ye
Mingshi Yan
...
Ji Zhang
Qin Jin
Liang He
Xin Lin
Feiyan Huang
VLM
MLLM
178
91
0
08 Oct 2023
Improved Baselines with Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM
MLLM
118
2,725
0
05 Oct 2023
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
Zhenhua Xu
Yujia Zhang
Enze Xie
Zhen Zhao
Yong Guo
Kwan-Yee K. Wong
Zhenguo Li
Hengshuang Zhao
MLLM
71
292
0
02 Oct 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
Yiyang Zhou
Chenhang Cui
Jaehong Yoon
Linjun Zhang
Zhun Deng
Chelsea Finn
Mohit Bansal
Huaxiu Yao
MLLM
109
181
0
01 Oct 2023
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao
Yuandong Tian
Beidi Chen
Song Han
Mike Lewis
AI4TS
RALM
125
758
0
29 Sep 2023
Vision Transformers Need Registers
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
164
343
0
28 Sep 2023
Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
Erfan Shayegani
Yue Dong
Nael B. Abu-Ghazaleh
94
147
0
26 Jul 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Ke Chen
Zhao Zhang
Weili Zeng
Richong Zhang
Feng Zhu
Rui Zhao
ObjD
85
638
0
27 Jun 2023
Kosmos-2: Grounding Multimodal Large Language Models to the World
Zhiliang Peng
Wenhui Wang
Li Dong
Y. Hao
Shaohan Huang
Shuming Ma
Furu Wei
MLLM
ObjD
VLM
104
755
0
26 Jun 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM
MLLM
105
279
0
26 Jun 2023
Visual Adversarial Examples Jailbreak Aligned Large Language Models
Xiangyu Qi
Kaixuan Huang
Ashwinee Panda
Peter Henderson
Mengdi Wang
Prateek Mittal
AAML
81
158
0
22 Jun 2023