2311.11477
Cited By
What's left can't be right -- The remaining positional incompetence of contrastive vision-language models
20 November 2023
Nils Hoehing
Ellen Rushe
Anthony Ventresque
VLM
Papers citing "What's left can't be right -- The remaining positional incompetence of contrastive vision-language models"
4 / 4 papers shown
Title
Dynamic Relation Inference via Verb Embeddings
Omri Suissa
Muhiim Ali
Ariana Azarbal
Hui Shen
Shekhar Pradhan
46
0
0
17 Mar 2025
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
How Does Fine-tuning Affect the Geometry of Embedding Space: A Case Study on Isotropy
S. Rajaee
Mohammad Taher Pilehvar
79
20
0
10 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
202
405
0
13 Jul 2021