2304.03659
Probing Conceptual Understanding of Large Visual-Language Models
7 April 2023
Madeline Chantry Schiappa
Raiyaan Abdullah
Shehreen Azad
Jared Claypoole
Michael Cogswell
Ajay Divakaran
Yogesh S Rawat
Papers citing "Probing Conceptual Understanding of Large Visual-Language Models"
10 / 10 papers shown
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad
Vibhav Vineet
Yogesh S Rawat
VLM
163
1
0
11 Mar 2025
BloomVQA: Assessing Hierarchical Multi-modal Comprehension
Yunye Gong
Robik Shrestha
Jared Claypoole
Michael Cogswell
Arijit Ray
Christopher Kanan
Ajay Divakaran
36
0
0
20 Dec 2023
COLA: A Benchmark for Compositional Text-to-image Retrieval
Arijit Ray
Filip Radenovic
Abhimanyu Dubey
Bryan A. Plummer
Ranjay Krishna
Kate Saenko
CoGe
VLM
41
34
0
05 May 2023
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
Anuj Diwan
Layne Berry
Eunsol Choi
David Harwath
Kyle Mahowald
CoGe
108
41
0
01 Nov 2022
Unpacking Large Language Models with Conceptual Consistency
Pritish Sahu
Michael Cogswell
Yunye Gong
Ajay Divakaran
LRM
87
16
0
29 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
208
310
0
02 Mar 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
299
1,084
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
331
3,708
0
11 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
79
110
0
31 Jan 2021