2311.07536
Cited By
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering
13 November 2023
Yunxin Li
Longyue Wang
Baotian Hu
Xinyu Chen
Wanqi Zhong
Chenyang Lyu
Wei Wang
Min Zhang
ELM
Papers citing "A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering"
7 / 7 papers shown
Title
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning
Yunxin Li
Xinyu Chen
Baotian Hu
Longyue Wang
Haoyuan Shi
Min-Ling Zhang
MLLM
LRM
53
25
0
17 Jun 2024
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li
Shenyuan Jiang
Baotian Hu
Longyue Wang
Wanqi Zhong
Wenhan Luo
Lin Ma
Min-Ling Zhang
MoE
46
28
0
18 May 2024
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
31
7
0
04 Dec 2023
A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text
Yunxin Li
Baotian Hu
Yuxin Ding
Lin Ma
M. Zhang
23
5
0
03 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
900
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Linearly Mapping from Image to Text Space
Jack Merullo
Louis Castricato
Carsten Eickhoff
Ellie Pavlick
VLM
167
104
0
30 Sep 2022