2311.07536
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering
13 November 2023
Yunxin Li
Longyue Wang
Baotian Hu
Xinyu Chen
Wanqi Zhong
Chenyang Lyu
Wei Wang
Min Zhang
ELM
Papers citing
"A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering"
16 / 16 papers shown
Title
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
Kai Zhang
Pengzhen Ren
Bingqian Lin
Junfan Lin
Shikui Ma
Hang Xu
Xiaodan Liang
61
2
0
14 Oct 2024
Explore the Hallucination on Low-level Perception for MLLMs
Yinan Sun
Zicheng Zhang
H. Wu
Xiaohong Liu
Weisi Lin
Guangtao Zhai
Xiongkuo Min
82
2
0
15 Sep 2024
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning
Yunxin Li
Xinyu Chen
Baotian Hu
Longyue Wang
Haoyuan Shi
Min Zhang
MLLM
LRM
167
38
0
17 Jun 2024
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance
Chenwei Lin
Hanjia Lyu
Xian Xu
Jiebo Luo
69
2
0
13 Jun 2024
An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging
Sulaiman Khan
Md. Rafiul Biswas
Alina Murad
Hazrat Ali
Zubair Shah
91
4
0
02 Jun 2024
Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs
Jialiang Xu
Michael Moor
J. Leskovec
66
3
0
29 May 2024
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li
Shenyuan Jiang
Baotian Hu
Longyue Wang
Wanqi Zhong
Wenhan Luo
Lin Ma
Min Zhang
MoE
111
42
0
18 May 2024
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li
Baotian Hu
Haoyuan Shi
Wei Wang
Longyue Wang
Min Zhang
LRM
68
16
0
08 May 2024
Comp4D: LLM-Guided Compositional 4D Scene Generation
Dejia Xu
Hanwen Liang
N. Bhatt
Hezhen Hu
Hanxue Liang
Konstantinos N. Plataniotis
Zhangyang Wang
87
27
0
25 Mar 2024
Benchmarking LLMs via Uncertainty Quantification
Fanghua Ye
Mingming Yang
Jianhui Pang
Longyue Wang
Derek F. Wong
Emine Yilmaz
Shuming Shi
Zhaopeng Tu
ELM
247
59
0
23 Jan 2024
DrugAssist: A Large Language Model for Molecule Optimization
Geyan Ye
Xibao Cai
Houtim Lai
Xing Wang
Junhong Huang
Longyue Wang
Wei Liu
Xian Zeng
123
33
0
28 Dec 2023
An Evaluation of GPT-4V and Gemini in Online VQA
Mengchen Liu
Chongyan Chen
Danna Gurari
MLLM
123
7
0
17 Dec 2023
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
96
8
0
04 Dec 2023
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
Yunxin Li
Baotian Hu
Wei Wang
Xiaochun Cao
Min Zhang
74
5
0
27 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM
LM&MA
90
39
0
31 Oct 2023
LMEye: An Interactive Perception Network for Large Language Models
Yunxin Li
Baotian Hu
Xinyu Chen
Lin Ma
Yong-mei Xu
Hao Fei
MLLM
VLM
93
28
0
05 May 2023