2303.02483
Cited By
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks
4 March 2023
Xiaoping Han
Xiatian Zhu
Licheng Yu
Li Zhang
Yi-Zhe Song
Tao Xiang
VLM
Papers citing
"FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks"
22 / 22 papers shown
Title
TMCIR: Token Merge Benefits Composed Image Retrieval
Chaoyang Wang
Zeyu Zhang
Long Teng
Zijun Li
Shichao Kan
31
0
0
15 Apr 2025
NCL-CIR: Noise-aware Contrastive Learning for Composed Image Retrieval
Peng Gao
Yujian Lee
Zailong Chen
Hui Zhang
Xubo Liu
Yiyang Hu
Guquang Jing
41
0
0
06 Apr 2025
Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval
Haoqiang Lin
Haokun Wen
Xuemeng Song
Meng Liu
Yupeng Hu
Liqiang Nie
54
14
0
25 Mar 2025
Data-Efficient Generalization for Zero-shot Composed Image Retrieval
Zining Chen
Zhicheng Zhao
Fei Su
Xiaoqin Zhang
Shijian Lu
VLM
45
0
0
07 Mar 2025
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
Kun Zhang
Jingyu Li
Z. Li
Jingjing Zhang
38
0
0
03 Mar 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
53
1
0
19 Feb 2025
UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation
Xiangyu Zhao
Yuehan Zhang
Wenlong Zhang
X. Wu
41
4
0
21 Aug 2024
FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation
Riza Velioglu
Robin Chan
Barbara Hammer
29
0
0
12 Apr 2024
SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining
Chull Hwan Song
Taebaek Hwang
Jooyoung Yoon
Shunghyun Choi
Yeong Hyeon Gu
23
4
0
01 Apr 2024
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
Junyan Wang
Zhenhong Sun
Zhiyu Tan
Xuanbai Chen
Weihua Chen
Hao Li
Cheng Zhang
Yang Song
37
9
0
08 Mar 2024
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval
Yongchao Du
Min Wang
Wen-gang Zhou
Shuping Hui
Houqiang Li
37
10
0
03 Mar 2024
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
Chun-Mei Feng
Yang Bai
Tao Luo
Zhen Li
Salman Khan
Wangmeng Zuo
Xinxing Xu
Rick Siow Mong Goh
Yong-Jin Liu
31
5
0
19 Dec 2023
Benchmarking Robustness of Text-Image Composed Retrieval
Shitong Sun
Jindong Gu
Shaogang Gong
CoGe
41
1
0
24 Nov 2023
Sentence-level Prompts Benefit Composed Image Retrieval
Yang Bai
Xinxing Xu
Yong-Jin Liu
Salman Khan
Fahad Khan
Wangmeng Zuo
Rick Siow Mong Goh
Chun-Mei Feng
36
26
0
09 Oct 2023
Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder
Zheyuan Liu
Weixuan Sun
Damien Teney
Stephen Gould
34
16
0
25 May 2023
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
Zheyuan Liu
Weixuan Sun
Yicong Hong
Damien Teney
Stephen Gould
38
30
0
29 Mar 2023
VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick
Li Jing
Sayan Nag
Jiachen Zhu
Hardik Shah
Yann LeCun
Ramalingam Chellappa
26
21
0
09 Oct 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
146
638
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
345
2,271
0
02 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
301
3,708
0
11 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019