ResearchTrend.AI
Evaluating Object Hallucination in Large Vision-Language Models

17 May 2023
Yifan Li
Yifan Du
Kun Zhou
Jinpeng Wang
Wayne Xin Zhao
Ji-Rong Wen
    MLLM
    LRM

Papers citing "Evaluating Object Hallucination in Large Vision-Language Models"

50 / 588 papers shown
Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering
Danfeng Guo
Sumitaka Honji
LRM
79
0
0
31 Jul 2024
Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate
Zheng Lin
Zhenxing Niu
Zhibin Wang
Yinghui Xu
39
4
0
30 Jul 2024
Diffusion Feedback Helps CLIP See Better
Wenxuan Wang
Quan-Sen Sun
Fan Zhang
Yepeng Tang
Jing Liu
Xinlong Wang
VLM
46
14
0
29 Jul 2024
ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2
Wenjun Huang
Jiakai Pan
Jiahao Tang
Yanyu Ding
Yifei Xing
Yuhe Wang
Zhengzhuo Wang
Jianguo Hu
Mamba
45
5
0
29 Jul 2024
LLAVADI: What Matters For Multimodal Large Language Models Distillation
Shilin Xu
Xiangtai Li
Haobo Yuan
Lu Qi
Yunhai Tong
Ming-Hsuan Yang
36
3
0
28 Jul 2024
VACoDe: Visual Augmented Contrastive Decoding
Sihyeon Kim
Boryeong Cho
Sangmin Bae
Sumyeong Ahn
SeYoung Yun
36
3
0
26 Jul 2024
$VILA^2$: VILA Augmented VILA
Yunhao Fang
Ligeng Zhu
Yao Lu
Yan Wang
Pavlo Molchanov
Jang Hyun Cho
Marco Pavone
Song Han
Hongxu Yin
VLM
47
7
0
24 Jul 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
Yiwei Ma
Zhibin Wang
Xiaoshuai Sun
Weihuang Lin
Qiang-feng Zhou
Jiayi Ji
Rongrong Ji
MLLM
VLM
57
1
0
23 Jul 2024
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity
Yangzhou Liu
Yue Cao
Zhangwei Gao
Weiyun Wang
Zhe Chen
...
Lewei Lu
Xizhou Zhu
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
57
23
0
22 Jul 2024
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
Ziyuan Huang
Kaixiang Ji
Biao Gong
Zhiwu Qing
Qinglong Zhang
Kecheng Zheng
Jian Wang
Jingdong Chen
Ming Yang
LRM
42
1
0
22 Jul 2024
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models
Moon Ye-Bin
Nam Hyeon-Woo
Wonseok Choi
Tae-Hyun Oh
MLLM
51
6
0
18 Jul 2024
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Tsung-Han Wu
Giscard Biamby
Jerome Quenum
Ritwik Gupta
Joseph E. Gonzalez
Trevor Darrell
David M. Chan
VLM
49
7
0
18 Jul 2024
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Kaichen Zhang
Bo Li
Peiyuan Zhang
Fanyi Pu
Joshua Adrian Cahyono
...
Shuai Liu
Yuanhan Zhang
Jingkang Yang
Chunyuan Li
Ziwei Liu
97
76
0
17 Jul 2024
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
Haodong Duan
Junming Yang
Xinyu Fang
Lin Chen
...
Yuhang Zang
Pan Zhang
Jiaqi Wang
Dahua Lin
Kai Chen
LM&MA
VLM
39
115
0
16 Jul 2024
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
Runhui Huang
Xinpeng Ding
Chunwei Wang
J. N. Han
Yulong Liu
Hengshuang Zhao
Hang Xu
Lu Hou
Wei Zhang
Xiaodan Liang
VLM
31
8
0
11 Jul 2024
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Xiaotong Li
Fan Zhang
Haiwen Diao
Yueze Wang
Xinlong Wang
Ling-yu Duan
VLM
31
26
0
11 Jul 2024
A Single Transformer for Scalable Vision-Language Modeling
Yangyi Chen
Xingyao Wang
Hao Peng
Heng Ji
LRM
42
15
0
08 Jul 2024
Vision-Language Models under Cultural and Inclusive Considerations
Antonia Karamolegkou
Phillip Rust
Yong Cao
Ruixiang Cui
Anders Søgaard
Daniel Hershcovich
VLM
53
7
0
08 Jul 2024
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip Torr
Lu Yuan
LRM
VLM
33
8
0
05 Jul 2024
TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li
Yuqian Yuan
Jian Liu
Dongqi Tang
Song Wang
Jie Qin
Jianke Zhu
Lei Zhang
MLLM
37
53
0
02 Jul 2024
Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
Khyathi Raghavi Chandu
Linjie Li
Anas Awadalla
Ximing Lu
Jae Sung Park
Jack Hessel
Lijuan Wang
Yejin Choi
50
2
0
02 Jul 2024
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
Yusu Qian
Hanrong Ye
J. Fauconnier
Peter Grasch
Yinfei Yang
Zhe Gan
108
13
0
01 Jul 2024
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun
Haokun Lin
Rusiru Thushara
Mohammad Qazim Bhat
Yongxin Wang
...
Timothy Baldwin
Zhengzhong Liu
Eric P. Xing
Xiaodan Liang
Zhiqiang Shen
54
10
0
28 Jun 2024
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression
Jieneng Chen
Luoxin Ye
Ju He
Zhao-Yang Wang
Daniel Khashabi
Alan Yuille
VLM
27
5
0
28 Jun 2024
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
Jihao Liu
Xin Huang
Jinliang Zheng
Boxiao Liu
Jia Wang
Osamu Yoshie
Yu Liu
Hongsheng Li
MLLM
SyDa
38
3
0
28 Jun 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
56
2
0
28 Jun 2024
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
Tao Zhang
Xiangtai Li
Hao Fei
Haobo Yuan
Shengqiong Wu
Shunping Ji
Chen Change Loy
Shuicheng Yan
LRM
MLLM
VLM
49
48
0
27 Jun 2024
MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs
Wenqian Ye
Guangtao Zheng
Yunsheng Ma
Xu Cao
Bolin Lai
James M. Rehg
Aidong Zhang
37
10
0
24 Jun 2024
Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models
Bei Yan
Jie Zhang
Zheng Yuan
Shiguang Shan
Xilin Chen
VLM
46
4
0
24 Jun 2024
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models
Mingrui Wu
Jiayi Ji
Oucheng Huang
Jiale Li
Yuhang Wu
Xiaoshuai Sun
Rongrong Ji
53
8
0
24 Jun 2024
VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models
Yuxuan Wang
Yueqian Wang
Dongyan Zhao
Cihang Xie
Zilong Zheng
MLLM
VLM
52
26
0
24 Jun 2024
MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception
Guanqun Wang
Xinyu Wei
Jiaming Liu
Ray Zhang
Yichi Zhang
Kevin Zhang
Maurice Chong
Shanghang Zhang
VLM
LRM
46
0
0
22 Jun 2024
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
Yuxuan Qiao
Haodong Duan
Xinyu Fang
Junming Yang
Lin Chen
Songyang Zhang
Jiaqi Wang
Dahua Lin
Kai Chen
LRM
45
19
0
20 Jun 2024
HIGHT: Hierarchical Graph Tokenization for Graph-Language Alignment
Yongqiang Chen
Quanming Yao
Juzheng Zhang
James Cheng
Yatao Bian
36
4
0
20 Jun 2024
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events
M. Tami
Huthaifa I. Ashqar
Mohammed Elhenawy
42
3
0
19 Jun 2024
SpatialBot: Precise Spatial Understanding with Vision Language Models
Wenxiao Cai
Yaroslav Ponomarenko
Jianhao Yuan
Xiaoqi Li
Wankou Yang
Hao Dong
Bo Zhao
VLM
56
30
0
19 Jun 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM
VLM
43
23
0
18 Jun 2024
Unveiling Encoder-Free Vision-Language Models
Haiwen Diao
Yufeng Cui
Xiaotong Li
Yueze Wang
Huchuan Lu
Xinlong Wang
VLM
59
29
0
17 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
85
24
0
17 Jun 2024
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
Wentao Yuan
Jiafei Duan
Valts Blukis
Wilbert Pumacay
Ranjay Krishna
Adithyavairavan Murali
Arsalan Mousavian
Dieter Fox
LM&Ro
50
49
0
15 Jun 2024
VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Chenyu Zhou
Mengdan Zhang
Peixian Chen
Chaoyou Fu
Yunhang Shen
Xiawu Zheng
Xing Sun
Rongrong Ji
VLM
27
3
0
14 Jun 2024
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models
Jiawei Chen
Dingkang Yang
Tong Wu
Yue Jiang
Xiaolu Hou
Mingcheng Li
Shunli Wang
Dongling Xiao
Ke Li
Lihua Zhang
LM&MA
VLM
42
18
0
14 Jun 2024
Yo'LLaVA: Your Personalized Language and Vision Assistant
Thao Nguyen
Haotian Liu
Yuheng Li
Mu Cai
Utkarsh Ojha
Yong Jae Lee
VLM
MLLM
64
15
0
13 Jun 2024
VLind-Bench: Measuring Language Priors in Large Vision-Language Models
Kang-il Lee
Minbeom Kim
Seunghyun Yoon
Minsung Kim
Dongryeol Lee
Hyukhun Koh
Kyomin Jung
CoGe
VLM
92
5
0
13 Jun 2024
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
Yi-Fan Zhang
Qingsong Wen
Chaoyou Fu
Xue Wang
Zhang Zhang
Liwen Wang
Rong Jin
34
40
0
12 Jun 2024
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li
Zhe Chen
Weiyun Wang
Wenhai Wang
Shenglong Ye
...
Dahua Lin
Yu Qiao
Botian Shi
Conghui He
Jifeng Dai
VLM
OffRL
56
21
0
12 Jun 2024
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions
Renjie Pi
Jianshu Zhang
Jipeng Zhang
Rui Pan
Zhekai Chen
Tong Zhang
3DV
47
19
0
11 Jun 2024
Needle In A Multimodal Haystack
Weiyun Wang
Shuibo Zhang
Yiming Ren
Yuchen Duan
Tiantong Li
...
Ping Luo
Yu Qiao
Jifeng Dai
Wenqi Shao
Wenhai Wang
VLM
59
17
0
11 Jun 2024
Vript: A Video Is Worth Thousands of Words
Dongjie Yang
Suyuan Huang
Chengqiang Lu
Xiaodong Han
Haoxin Zhang
Yan Gao
Yao Hu
Hai Zhao
VGen
80
22
0
10 Jun 2024
CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
Peng Xia
Ze Chen
Juanxi Tian
Yangrui Gong
Ruibo Hou
...
Jimeng Sun
Zongyuan Ge
Gang Li
James Zou
Huaxiu Yao
MU
VLM
69
31
0
10 Jun 2024