ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.14566
  4. Cited By
HallusionBench: An Advanced Diagnostic Suite for Entangled Language
  Hallucination and Visual Illusion in Large Vision-Language Models

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models

23 October 2023
Tianrui Guan
Fuxiao Liu
Xiyang Wu
Ruiqi Xian
Zongxia Li
Xiaoyu Liu
Xijun Wang
Lichang Chen
Furong Huang
Yaser Yacoob
Dinesh Manocha
Dinesh Manocha
    VLM
    MLLM
ArXivPDFHTML

Papers citing "HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models"

40 / 40 papers shown
Title
Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput
Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput
Bo Zhang
Shuo Li
Runhe Tian
Yang Yang
Jixin Tang
Jinhao Zhou
Lin Ma
VLM
31
0
0
14 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
79
1
0
05 May 2025
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling
Siqi Li
Yufan Shen
Xiangnan Chen
Jiayi Chen
Hengwei Ju
...
Licheng Wen
Botian Shi
Y. Liu
Xinyu Cai
Yu Qiao
VLM
ELM
91
0
0
30 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
86
2
0
26 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
70
15
1
14 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
75
0
0
01 Apr 2025
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
95
0
0
26 Mar 2025
RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
Mehdi Moshtaghi
Siavash H. Khajavi
Joni Pajarinen
VLM
54
0
0
25 Mar 2025
Aligning Multimodal LLM with Human Preference: A Survey
Aligning Multimodal LLM with Human Preference: A Survey
Tao Yu
Yuyao Zhang
Chaoyou Fu
Junkang Wu
Jinda Lu
...
Qingsong Wen
Z. Zhang
Yan Huang
Liang Wang
Tieniu Tan
185
2
0
18 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Yansen Wang
Shengqiong Wu
Yuyao Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
92
9
0
16 Mar 2025
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Wenxuan Huang
Bohan Jia
Zijie Zhai
Shaosheng Cao
Zheyu Ye
Fei Zhao
Zhe Xu
Yao Hu
Shaohui Lin
MU
OffRL
LRM
MLLM
ReLM
VLM
59
45
0
09 Mar 2025
Nexus-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision
Che Liu
Yingji Zhang
D. Zhang
Weijie Zhang
Chenggong Gong
...
André Freitas
Qifan Wang
Z. Xu
Rongjuncheng Zhang
Yong Dai
AuLLM
76
0
0
26 Feb 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-WeiChang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
90
3
0
26 Feb 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
Jiaheng Liu
Tao Zhang
Tao Zhang
Tian Jin
...
Jianhua Xu
Haoze Sun
Mingan Lin
Zenan Zhou
Xin Wu
AuLLM
72
12
0
28 Jan 2025
ViDDAR: Vision Language Model-Based Task-Detrimental Content Detection for Augmented Reality
ViDDAR: Vision Language Model-Based Task-Detrimental Content Detection for Augmented Reality
Yanming Xiu
T. Scargill
M. Gorlatova
70
2
0
22 Jan 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
Enzo Tartaglione
VLM
132
3
0
05 Jan 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
104
2
0
20 Dec 2024
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
Xueqing Wu
Yuheng Ding
Bingxuan Li
Pan Lu
Da Yin
Kai-Wei Chang
Nanyun Peng
LRM
105
3
0
03 Dec 2024
Progress-Aware Video Frame Captioning
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
100
1
0
03 Dec 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
109
7
0
27 Nov 2024
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Davide Paglieri
Bartłomiej Cupiał
Samuel Coward
Ulyana Piterbarg
Maciej Wolczyk
...
Lerrel Pinto
Rob Fergus
Jakob Foerster
Jack Parker-Holder
Tim Rocktaschel
LLMAG
LRM
116
10
0
20 Nov 2024
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities
Zheyuan Zhang
Fengyuan Hu
Jayjun Lee
Freda Shi
Parisa Kordjamshidi
Joyce Chai
Ziqiao Ma
59
11
0
22 Oct 2024
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs
Yujun Zhou
Jingdong Yang
Kehan Guo
Pin-Yu Chen
Tian Gao
...
Tian Gao
Werner Geyer
Nuno Moniz
Nitesh V Chawla
Xiangliang Zhang
43
5
0
18 Oct 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLM
MLLM
65
25
0
10 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Jiaheng Liu
Chang Tang
Xuming Hu
89
7
0
04 Oct 2024
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Zhiyu Tan
Hao Li
Jingjing Chen
MLLM
61
18
0
25 Sep 2024
Visual Agents as Fast and Slow Thinkers
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Ying Nian Wu
Dongfang Liu
Dongfang Liu
LLMAG
LRM
79
13
0
16 Aug 2024
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
Sayna Ebrahimi
Sercan Ö. Arik
Tejas Nama
Tomas Pfister
44
1
0
13 Aug 2024
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Fushuo Huo
Wenchao Xu
Zhong Zhang
Yining Qi
Zhicheng Chen
Peilin Zhao
VLM
MLLM
66
19
0
04 Aug 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large
  Vision-Language Models
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLM
LRM
VLM
34
3
0
16 Jul 2024
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
Hengyi Wang
Haizhou Shi
Shiwei Tan
Weiyi Qin
Wenyuan Wang
Tunyu Zhang
A. Nambi
T. Ganu
Hao Wang
71
15
0
17 Jun 2024
DiffuSyn Bench: Evaluating Vision-Language Models on Real-World
  Complexities with Diffusion-Generated Synthetic Benchmarks
DiffuSyn Bench: Evaluating Vision-Language Models on Real-World Complexities with Diffusion-Generated Synthetic Benchmarks
Haokun Zhou
Yipeng Hong
VLM
EGVM
31
1
0
06 Jun 2024
ChartReformer: Natural Language-Driven Chart Image Editing
ChartReformer: Natural Language-Driven Chart Image Editing
Pengyu Yan
Mahesh Bhosale
Jay Lal
Bikhyat Adhikari
David Doermann
26
7
0
01 Mar 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with
  Modality Collaboration
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM
VLM
126
379
0
07 Nov 2023
Large Language Models Can Be Good Privacy Protection Learners
Large Language Models Can Be Good Privacy Protection Learners
Yijia Xiao
Yiqiao Jin
Yushi Bai
Yue Wu
Xianjun Yang
...
Xujiang Zhao
Yanchi Liu
Haifeng Chen
Wei Wang
Wei Cheng
PILM
105
16
0
03 Oct 2023
Investigating the Catastrophic Forgetting in Multimodal Large Language
  Models
Investigating the Catastrophic Forgetting in Multimodal Large Language Models
Yuexiang Zhai
Shengbang Tong
Xiao Li
Mu Cai
Qing Qu
Yong Jae Lee
Y. Ma
VLM
MLLM
CLL
77
77
0
19 Sep 2023
Sparkles: Unlocking Chats Across Multiple Images for Multimodal
  Instruction-Following Models
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
Yupan Huang
Zaiqiao Meng
Fangyu Liu
Yixuan Su
Nigel Collier
Yutong Lu
MLLM
41
22
0
31 Aug 2023
mPLUG-Owl: Modularization Empowers Large Language Models with
  Multimodality
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
905
0
27 Apr 2023
Instruction Tuning with GPT-4
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
162
585
0
06 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
287
4,261
0
30 Jan 2023
1