2302.03084
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
6 February 2023
Kuniaki Saito
Kihyuk Sohn
Xiang Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
Papers citing "Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval"
28 / 78 papers shown
Title
It's All About Your Sketch: Democratising Sketch Control in Diffusion Models
Subhadeep Koley
A. Bhunia
Deeptanshu Sekhri
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
DiffM
42
16
0
12 Mar 2024
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
3DV
51
11
0
12 Mar 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
40
10
0
03 Mar 2024
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
Shiyin Dong
Mingrui Zhu
Kun Cheng
Nannan Wang
Xinbo Gao
DiffM
30
3
0
29 Jan 2024
Training-free Zero-shot Composed Image Retrieval with Local Concept Reranking
Shitong Sun
Fanghua Ye
Shaogang Gong
26
13
0
14 Dec 2023
Language-only Efficient Training of Zero-shot Composed Image Retrieval
Geonmo Gu
Sanghyuk Chun
Wonjae Kim
Yoohoon Kang
Sangdoo Yun
26
14
0
04 Dec 2023
Automatic Synthetic Data and Fine-grained Adaptive Feature Alignment for Composed Person Retrieval
Delong Liu
Haiwen Li
Zhicheng Zhao
Fei Su
Yuan Dong
24
0
0
25 Nov 2023
Benchmarking Robustness of Text-Image Composed Retrieval
Shitong Sun
Jindong Gu
Shaogang Gong
CoGe
44
1
0
24 Nov 2023
Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval
Junyang Chen
Hanjiang Lai
VLM
45
15
0
13 Nov 2023
Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service
Yuanmin Tang
Jing Yu
Keke Gai
Xiangyang Qu
Yue Hu
Gang Xiong
Qi Wu
AAML
WaLM
VLM
24
7
0
10 Nov 2023
Vision-by-Language for Training-Free Compositional Image Retrieval
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
CoGe
28
52
0
13 Oct 2023
Mapping Memes to Words for Multimodal Hateful Meme Classification
Giovanni Burbi
Alberto Baldrati
Lorenzo Agnolucci
Marco Bertini
A. Bimbo
24
12
0
12 Oct 2023
Sentence-level Prompts Benefit Composed Image Retrieval
Yang Bai
Xinxing Xu
Yong-Jin Liu
Salman Khan
Fahad Khan
Wangmeng Zuo
Rick Siow Mong Goh
Chun-Mei Feng
36
26
0
09 Oct 2023
Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval
Yuanmin Tang
Jiahao Yu
Keke Gai
Jiamin Zhuang
Gang Xiong
Yue Hu
Qi Wu
33
33
0
28 Sep 2023
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
Changming Xiao
Qi Yang
Feng Zhou
Changshui Zhang
33
17
0
08 Sep 2023
Dual Relation Alignment for Composed Image Retrieval
Xintong Jiang
Yaxiong Wang
Yujiao Wu
Hao Wu
Xueming Qian
33
5
0
05 Sep 2023
CoVR: Learning Composed Video Retrieval from Web Video Captions
Lucas Ventura
Antoine Yang
Cordelia Schmid
Gül Varol
22
21
0
28 Aug 2023
Zero-shot Composed Text-Image Retrieval
Yikun Liu
Jiangchao Yao
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
27
24
0
12 Jun 2023
ConES: Concept Embedding Search for Parameter Efficient Tuning Large Vision Language Models
Huahui Yi
Ziyuan Qin
Wei Xu
Miaotian Guo
Kun Wang
Shaoting Zhang
Kang Li
Qicheng Lao
VLM
21
0
0
30 May 2023
COLA: A Benchmark for Compositional Text-to-image Retrieval
Arijit Ray
Filip Radenovic
Abhimanyu Dubey
Bryan A. Plummer
Ranjay Krishna
Kate Saenko
CoGe
VLM
41
34
0
05 May 2023
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
Zheyuan Liu
Weixuan Sun
Yicong Hong
Damien Teney
Stephen Gould
40
30
0
29 Mar 2023
Zero-Shot Composed Image Retrieval with Textual Inversion
Alberto Baldrati
Lorenzo Agnolucci
Marco Bertini
A. Bimbo
20
102
0
27 Mar 2023
CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion
Geonmo Gu
Sanghyuk Chun
Wonjae Kim
HeeJae Jun
Yoohoon Kang
Sangdoo Yun
DiffM
36
50
0
21 Mar 2023
Multimodal Prompting with Missing Modalities for Visual Recognition
Yi-Lun Lee
Yi-Hsuan Tsai
Wei-Chen Chiu
Chen-Yu Lee
VPVLM
30
94
0
06 Mar 2023
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
Yiyang Chen
Zhedong Zheng
Wei Ji
Leigang Qu
Tat-Seng Chua
32
37
0
14 Nov 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
313
3,708
0
11 Feb 2021