Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.13854
Cited By
ComCLIP: Training-Free Compositional Image and Text Matching
25 November 2022
Kenan Jiang
Xuehai He
Ruize Xu
Junfeng Fang
VLM
CLIP
CoGe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ComCLIP: Training-Free Compositional Image and Text Matching"
8 / 8 papers shown
Title
Compositional Image-Text Matching and Retrieval by Grounding Entities
Madhukar Reddy Vongala
Saurabh Srivastava
Jana Kosecka
CLIP
CoGe
VLM
41
0
0
04 May 2025
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Jackson Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Junfeng Fang
VGen
86
1
0
12 Dec 2024
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
Sreyan Ghosh
Ashish Seth
Sonal Kumar
Utkarsh Tyagi
Chandra Kiran Reddy Evuru
S. Ramaneswaran
S. Sakshi
Oriol Nieto
R. Duraiswami
Dinesh Manocha
AuLLM
VLM
CoGe
45
23
0
12 Oct 2023
Multimodal Graph Transformer for Multimodal Question Answering
Xuehai He
Xin Eric Wang
36
7
0
30 Apr 2023
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr
Pengzhan Sun
Xiaoqian Shen
Faizan Farooq Khan
Li Erran Li
Mohamed Elhoseiny
VLM
26
76
0
11 Apr 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,171
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
340
3,726
0
11 Feb 2021
Similarity Reasoning and Filtration for Image-Text Matching
Haiwen Diao
Ying Zhang
Lingyun Ma
Huchuan Lu
240
332
0
05 Jan 2021
1