2303.16604
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
29 March 2023
Zheyuan Liu
Weixuan Sun
Yicong Hong
Damien Teney
Stephen Gould
Github (31★)
Papers citing "Bi-directional Training for Composed Image Retrieval via Text Prompt Learning"
32 / 32 papers shown
Title
DetailFusion: A Dual-branch Framework with Detail Enhancement for Composed Image Retrieval
Yuxin Yang
Yinan Zhou
Yuxin Chen
Ziqi Zhang
Zongyang Ma
...
Bing Li
Lin Song
Jun Gao
Peng Li
Weiming Hu
191
0
0
23 May 2025
Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval
Junyang Chen
Hanjiang Lai
VLM
86
15
0
13 Nov 2023
Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder
Zheyuan Liu
Weixuan Sun
Damien Teney
Stephen Gould
66
18
0
25 May 2023
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
Weixuan Sun
Jiayi Zhang
Jianyuan Wang
Zheyuan Liu
Yiran Zhong
Tianpeng Feng
Yandong Guo
Yanhao Zhang
Nick Barnes
SSL
43
46
0
20 Mar 2023
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
Xiaoping Han
Xiatian Zhu
Licheng Yu
Li Zhang
Yi-Zhe Song
Tao Xiang
VLM
55
39
0
04 Mar 2023
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Kuniaki Saito
Kihyuk Sohn
Xiang Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
95
119
0
06 Feb 2023
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
197
3,482
0
16 Oct 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
279
2,885
0
25 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
164
1,889
0
02 Aug 2022
ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity
Ginger Delmas
Rafael Sampaio de Rezende
G. Csurka
Diane Larlus
VLM
55
102
0
15 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
547
4,398
0
28 Jan 2022
Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
Zheyuan Liu
Cristian Rodriguez-Opazo
Damien Teney
Stephen Gould
VLM
64
203
0
09 Aug 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
579
4,077
0
18 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
967
29,731
0
26 Feb 2021
SAC: Semantic Attention Composition for Text-Conditioned Image Retrieval
Surgan Jandial
Pinkesh Badjatiya
Pranit Chawla
Ayush Chopra
Mausoom Sarkar
Balaji Krishnamurthy
54
47
0
03 Sep 2020
Modality-Agnostic Attention Fusion for visual search with text feedback
Eric Dodds
Jack Culpepper
Simão Herdade
Yang Zhang
K. Boakye
EgoV
83
74
0
30 Jun 2020
Compositional Learning of Image-Text Query for Image Retrieval
Muhammad Umer Anwaar
Egor Labintcev
M. Kleinsteuber
CoGe
62
95
0
19 Jun 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
121
1,944
0
13 Apr 2020
CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data
Youngjae Yu
Seunghwan Lee
Yuncheol Choi
Gunhee Kim
CoGe
53
37
0
27 Mar 2020
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu
Yupeng Gao
Xiaoxiao Guo
Ziad Al-Halah
Steven J. Rennie
Kristen Grauman
Rogerio Feris
EgoV
118
67
0
30 May 2019
LaSO: Label-Set Operations networks for multi-label few-shot learning
Amit Alfassy
Leonid Karlinsky
Amit Aides
J. Shtok
Sivan Harary
Rogerio Feris
Raja Giryes
A. Bronstein
115
118
0
26 Feb 2019
Cycle-Consistency for Robust Visual Question Answering
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
62
190
0
15 Feb 2019
Composing Text and Image for Image Retrieval - An Empirical Odyssey
Nam S. Vo
Lu Jiang
Chen Sun
Kevin Patrick Murphy
Li Li
Li Fei-Fei
James Hays
CoGe
56
368
0
18 Dec 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
106
608
0
01 Nov 2018
Mixed Precision Training
Paulius Micikevicius
Sharan Narang
Jonah Alben
G. Diamos
Erich Elsen
...
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
Hao Wu
168
1,804
0
10 Oct 2017
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
356
2,230
0
22 Sep 2017
Automatic Spatially-aware Fashion Concept Discovery
Xintong Han
Zuxuan Wu
Phoenix X. Huang
Xiao Zhang
Menglong Zhu
Yuan Li
Yang Zhao
L. Davis
81
272
0
03 Aug 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
728
132,199
0
12 Jun 2017
A simple neural network module for relational reasoning
Adam Santoro
David Raposo
David Barrett
Mateusz Malinowski
Razvan Pascanu
Peter W. Battaglia
Timothy Lillicrap
GNN
NAI
189
1,615
0
05 Jun 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
311
2,386
0
20 Dec 2016
Multimodal Residual Learning for Visual QA
Jin-Hwa Kim
Sang-Woo Lee
Donghyun Kwak
Min-Oh Heo
Jeonghee Kim
Jung-Woo Ha
Byoung-Tak Zhang
53
300
0
05 Jun 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
886
27,412
0
02 Dec 2015