2408.06721
Cited By
Response Wide Shut: Surprising Observations in Basic Vision Language Model Capabilities
13 August 2024
Shivam Chandhok
Wan-Cyuan Fan
Leonid Sigal
VLM
MLLM
Papers citing
"Response Wide Shut: Surprising Observations in Basic Vision Language Model Capabilities"
3 / 3 papers shown
Title
Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?
Antonia Wüst
Tim Nelson Tobiasch
Lukas Helff
Inga Ibs
Wolfgang Stammer
D. Dhami
Constantin Rothkopf
Kristian Kersting
CoGe
ReLM
VLM
LRM
63
1
0
25 Oct 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
160
441
0
14 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023