arXiv:2203.14713 (v3, latest)
Image-text Retrieval: A Survey on Recent Research and Development
28 March 2022
Min Cao
Shiping Li
Juntao Li
Liqiang Nie
Min Zhang
Papers citing "Image-text Retrieval: A Survey on Recent Research and Development"
33 / 33 papers shown
Title
Adding simple structure at inference improves Vision-Language Compositionality
Imanol Miranda
Ander Salaberria
Eneko Agirre
Gorka Azkune
CoGe
VLM
76
0
0
11 Jun 2025
Robust Duality Learning for Unsupervised Visible-Infrared Person Re-Identification
Yongxiang Li
Yuan Sun
Yang Qin
Dezhong Peng
Xi Peng
Peng Hu
118
0
0
05 May 2025
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang
Jiale Fu
Chenduo Hao
Xinting Hu
Yingzhe Peng
Xin Geng
Xu Yang
108
0
0
11 Apr 2025
Anatomy-Aware Conditional Image-Text Retrieval
Meng Zheng
Jiajin Zhang
Benjamin Planche
Zhongpai Gao
Terrence Chen
Ziyan Wu
MedIm
87
0
0
10 Mar 2025
Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
177
0
0
21 Dec 2024
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models
Simone Caldarella
Massimiliano Mancini
Elisa Ricci
Rahaf Aljundi
PILM
72
2
0
02 Aug 2024
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi
Takashi Shibata
Makoto Terao
VLM
82
2
0
17 Jul 2024
MMIS: Multimodal Dataset for Interior Scene Visual Generation and Recognition
Hozaifa Kassab
Ahmed Mahmoud
Mohamed Bahaa
Ammar Mohamed
Ali Hamdi
VLM
104
0
0
08 Jul 2024
Accelerating Transformers with Spectrum-Preserving Token Merging
Hoai-Chau Tran
D. M. Nguyen
Duy M. Nguyen
Trung Thanh Nguyen
Ngan Le
Pengtao Xie
Daniel Sonntag
James Y. Zou
Binh T. Nguyen
Mathias Niepert
106
13
0
25 May 2024
DEMO: A Statistical Perspective for Efficient Image-Text Matching
Fan Zhang
Xian-Sheng Hua
Chong Chen
Xiao Luo
67
0
0
19 May 2024
Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction
Jiyuan Fu
Zhaoyu Chen
Kaixun Jiang
Haijing Guo
Jiafeng Wang
Shuyong Gao
Wenqiang Zhang
VLM
AAML
81
4
0
16 Mar 2024
Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Shunsuke Tsubaki
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
Keisuke Imoto
65
1
0
16 Mar 2024
A Survey of Route Recommendations: Methods, Applications, and Opportunities
Shi-sheng Zhang
Zhipeng Luo
Li Yang
Fei Teng
Tian-Jie Li
AI4TS
HAI
63
12
0
01 Mar 2024
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
Tao Tang
Dafeng Wei
Zhengyu Jia
Tian Gao
Changwei Cai
...
Yixing Zhao
Fu Liu
Xiaodan Liang
Xianpeng Lang
Yang Wang
67
7
0
02 Jan 2024
OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization
Dongchen Han
Xiaojun Jia
Yang Bai
Jindong Gu
Yang Liu
Xiaochun Cao
VLM
86
26
0
07 Dec 2023
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
Siyuan Liang
Mingli Zhu
Aishan Liu
Baoyuan Wu
Xiaochun Cao
Ee-Chien Chang
116
58
0
20 Nov 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai-Nguyen Nguyen
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
128
7
0
23 Sep 2023
State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
Devleena Das
Sonia Chernova
Been Kim
LRM
LLMAG
105
24
0
21 Sep 2023
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Ali Abdari
Alex Falcon
Giuseppe Serra
61
4
0
06 Sep 2023
Progressive Feature Mining and External Knowledge-Assisted Text-Pedestrian Image Retrieval
Huafeng Li
Shedan Yang
Yafei Zhang
Dapeng Tao
Z. Yu
76
3
0
23 Aug 2023
Mitigating Test-Time Bias for Fair Image Retrieval
Fanjie Kong
Shuai Yuan
Weituo Hao
Ricardo Henao
65
19
0
23 May 2023
RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search
Yang Bai
Ming-Ming Cao
Daming Gao
Ziqiang Cao
Cheng Chen
Zhenfeng Fan
Liqiang Nie
Min Zhang
AI4TS
126
61
0
23 May 2023
Text-based Person Search without Parallel Image-Text Data
Yang Bai
Wenwen Qiang
Min Cao
Cheng Chen
Ziqiang Cao
Liqiang Nie
Min Zhang
90
15
0
22 May 2023
IIITD-20K: Dense captioning for Text-Image ReID
A. V. Subramanyam
N. Sundararajan
Vibhu Dubey
Brejesh Lall
VLM
30
3
0
08 May 2023
Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information
Sunjae Kwon
Rishabh Garodia
Minhwa Lee
Zhichao Yang
Hong-ye Yu
CoGe
91
5
0
02 May 2023
Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment
Zhen Zhang
Jialu Wang
Xinze Wang
VLM
94
2
0
02 May 2023
Interpreting Vision and Language Generative Models with Semantic Visual Priors
Michele Cafagna
L. Rojas-Barahona
Kees van Deemter
Albert Gatt
FAtt
VLM
59
3
0
28 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
165
551
0
03 Apr 2023
Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening
Min Cao
Yang Bai
Wenwen Qiang
Ziqiang Cao
Liqiang Nie
Min Zhang
75
0
0
14 Mar 2023
Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching
Zheng Li
Caili Guo
Xin Eric Wang
Zerun Feng
Zhongtian Du
VLM
83
4
0
01 Mar 2023
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
66
12
0
29 Nov 2022
Data Poisoning Attacks Against Multimodal Encoders
Ziqing Yang
Xinlei He
Zheng Li
Michael Backes
Mathias Humbert
Pascal Berrang
Yang Zhang
AAML
176
52
0
30 Sep 2022
Test-time Training for Data-efficient UCDR
Soumava Paul
Titir Dutta
Aheli Saha
Abhishek Samanta
Soma Biswas
OOD
129
0
0
19 Aug 2022