Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.10537
Cited By
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models
20 December 2022
Martha Lewis
Nihal V. Nayak
Peilin Yu
Qinan Yu
Jack Merullo
Stephen H. Bach
Ellie Pavlick
VLM
OCL
CoGe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Does CLIP Bind Concepts? Probing Compositionality in Large Image Models"
48 / 48 papers shown
Title
Compositional Image-Text Matching and Retrieval by Grounding Entities
Madhukar Reddy Vongala
Saurabh Srivastava
Jana Kosecka
CLIP
CoGe
VLM
41
0
0
04 May 2025
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Huu Dat
Nam Hyeonu
Po Yuan Mao
Tae-Hyun Oh
DiffM
CoGe
71
0
0
02 May 2025
Human-like compositional learning of visually-grounded concepts using synthetic environments
Zijun Lin
M Ganesh Kumar
Cheston Tan
OCL
CoGe
75
0
0
09 Apr 2025
Evaluating Compositional Scene Understanding in Multimodal Generative Models
Shuhao Fu
Andrew Jun Lee
Anna Wang
Ida Momennejad
Trevor Bihl
Hongjing Lu
Taylor Webb
CoGe
OCL
109
1
0
29 Mar 2025
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi
Matteo Farina
Massimiliano Mancini
Elisa Ricci
Nicola Strisciuglio
CoGe
68
0
0
21 Mar 2025
Dynamic Relation Inference via Verb Embeddings
Omri Suissa
Muhiim Ali
Ariana Azarbal
Hui Shen
Shekhar Pradhan
46
0
0
17 Mar 2025
On the Limitations of Vision-Language Models in Understanding Image Transforms
Ahmad Mustafa Anis
Hasnain Ali
Saquib Sarfraz
VLM
Presented at
ResearchTrend Connect | VLM
on
28 Mar 2025
151
0
0
12 Mar 2025
Is CLIP ideal? No. Can we fix it? Yes!
Raphi Kang
Yue Song
Georgia Gkioxari
Pietro Perona
VLM
63
0
0
10 Mar 2025
Bayesian Fields: Task-driven Open-Set Semantic Gaussian Splatting
Dominic Maggio
Luca Carlone
219
0
0
07 Mar 2025
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Vivek Myers
Bill Chunyuan Zheng
Anca Dragan
Kuan Fang
Sergey Levine
72
0
0
08 Feb 2025
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
79
2
0
20 Nov 2024
ResiDual Transformer Alignment with Spectral Decomposition
Lorenzo Basile
Valentino Maiorca
Luca Bortolussi
Emanuele Rodolà
Francesco Locatello
60
1
0
31 Oct 2024
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Declan Campbell
Sunayana Rane
Tyler Giallanza
Nicolò De Sabbata
Kia Ghods
...
Alexander Ku
Steven M. Frankland
Thomas Griffiths
Jonathan D. Cohen
Taylor W. Webb
42
13
0
31 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
66
5
0
18 Oct 2024
Do Pre-trained Vision-Language Models Encode Object States?
Kaleb Newman
Shijie Wang
Yuan Zang
David Heffren
Chen Sun
CoGe
34
1
0
16 Sep 2024
Finetuning CLIP to Reason about Pairwise Differences
Dylan Sam
Devin Willmott
João Dias Semedo
J. Zico Kolter
VLM
71
3
0
15 Sep 2024
What happens to diffusion model likelihood when your model is conditional?
Mattias Cross
Anton Ragni
DiffM
42
0
0
10 Sep 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur
Darshan Singh
Makarand Tapaswi
186
1
0
04 Sep 2024
Relational Composition in Neural Networks: A Survey and Call to Action
Martin Wattenberg
Fernanda Viégas
CoGe
50
9
0
19 Jul 2024
Towards Compositionality in Concept Learning
Adam Stein
Aaditya Naik
Yinjun Wu
Mayur Naik
Eric Wong
CoGe
39
2
0
26 Jun 2024
Improving Interpretability and Robustness for the Detection of AI-Generated Images
T. Gaintseva
Laida Kushnareva
German Magai
Irina Piontkovskaya
Sergey I. Nikolenko
Martin Benning
S. Barannikov
Gregory Slabaugh
39
1
0
21 Jun 2024
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
Wufei Ma
Guanning Zeng
Guofeng Zhang
Qihao Liu
Letian Zhang
Adam Kortylewski
Yaoyao Liu
Alan Yuille
VLM
3DV
49
7
0
13 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
73
6
0
26 May 2024
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin
Sam Whitman McGrath
Danielle J. Williams
Lotem Elber-Dorozko
AI4CE
83
3
0
24 May 2024
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation
Kevin Stangl
Marius Arvinte
Weilin Xu
Cory Cornelius
VLM
UQCV
40
0
0
13 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
66
14
0
06 May 2024
Improving Concept Alignment in Vision-Language Concept Bottleneck Models
Nithish Muthuchamy Selvaraj
Xiaobao Guo
Bingquan Shen
A. Kong
Alex C. Kot
VLM
51
0
0
03 May 2024
Pre-trained Vision-Language Models Learn Discoverable Visual Concepts
Yuan Zang
Tian Yun
Hao Tan
Trung Bui
Chen Sun
VLM
CoGe
63
9
0
19 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
40
78
0
12 Apr 2024
Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
Reza Abbasi
Mohammad Samiei
M. Rohban
M. Baghshah
VLM
CoGe
35
0
0
27 Mar 2024
If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions
Reza Esfandiarpoor
Cristina Menghini
Stephen H. Bach
CoGe
VLM
40
8
0
25 Mar 2024
Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images
Hansa Srinivasan
Candice Schumann
Aradhana Sinha
David Madras
Gbolahan O. Olanubi
Alex Beutel
Susanna Ricco
Jilin Chen
40
5
0
25 Jan 2024
Grounded learning for compositional vector semantics
Martha Lewis
CoGe
42
0
0
10 Jan 2024
FoMo Rewards: Can we cast foundation models as reward functions?
Ekdeep Singh Lubana
Johann Brehmer
P. D. Haan
Taco S. Cohen
OffRL
LRM
48
2
0
06 Dec 2023
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
39
8
0
21 Nov 2023
SelfEval: Leveraging the discriminative nature of generative models for evaluation
Sai Saketh Rambhatla
Ishan Misra
EGVM
38
4
0
17 Nov 2023
Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task
Maya Okawa
Ekdeep Singh Lubana
Robert P. Dick
Hidenori Tanaka
CoGe
DiffM
39
46
0
13 Oct 2023
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models
Vishaal Udandarao
Max F. Burg
Samuel Albanie
Matthias Bethge
VLM
39
9
0
12 Oct 2023
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control
Vivek Myers
Andre Wang He
Kuan Fang
Homer Walke
Philippe Hansen-Estruch
Ching-An Cheng
Mihai Jalobeanu
Andrey Kolobov
Anca Dragan
Sergey Levine
LM&Ro
32
29
0
30 Jun 2023
Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer
Elinor Poole-Dayan
Vikram S. Voleti
Christopher Pal
Siva Reddy
45
13
0
25 May 2023
Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
Wentao Bao
Lichang Chen
Heng-Chiao Huang
Yu Kong
CoGe
VLM
33
12
0
23 May 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
38
107
0
27 Mar 2023
ConceptFusion: Open-set Multimodal 3D Mapping
Krishna Murthy Jatavallabhula
Ali Kuwajerwala
Qiao Gu
Mohd. Omama
Tao Chen
...
Celso Miguel de Melo
Madhava Krishna
Liam Paull
Florian Shkurti
Antonio Torralba
35
232
0
14 Feb 2023
When are Lemons Purple? The Concept Association Bias of Vision-Language Models
Yutaro Yamada
Yingtian Tang
Yoyo Zhang
Ilker Yildirim
CoGe
26
14
0
22 Dec 2022
Compositional Generalisation with Structured Reordering and Fertility Layers
Matthias Lindemann
Alexander Koller
Ivan Titov
CoGe
40
7
0
06 Oct 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,171
0
28 Jan 2022
Memorisation versus Generalisation in Pre-trained Language Models
Michael Tänzer
Sebastian Ruder
Marek Rei
94
50
0
16 Apr 2021
From Frequency to Meaning: Vector Space Models of Semantics
Peter D. Turney
Patrick Pantel
110
2,982
0
04 Mar 2010
1