Image-text Retrieval: A Survey on Recent Research and Development
arXiv: 2203.14713 (versions: v1, v2, v3; v3 is the latest)

28 March 2022
Min Cao
Shiping Li
Juntao Li
Liqiang Nie
Min Zhang

Papers citing "Image-text Retrieval: A Survey on Recent Research and Development"

33 / 33 papers shown
Adding simple structure at inference improves Vision-Language Compositionality
Imanol Miranda
Ander Salaberria
Eneko Agirre
Gorka Azkune
CoGe VLM
76
0
0
11 Jun 2025
Robust Duality Learning for Unsupervised Visible-Infrared Person Re-Identification
Yongxiang Li
Yuan Sun
Yang Qin
Dezhong Peng
Xi Peng
Peng Hu
118
0
0
05 May 2025
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang
Jiale Fu
Chenduo Hao
Xinting Hu
Yingzhe Peng
Xin Geng
Xu Yang
108
0
0
11 Apr 2025
Anatomy-Aware Conditional Image-Text Retrieval
Meng Zheng
Jiajin Zhang
Benjamin Planche
Zhongpai Gao
Terrence Chen
Ziyan Wu
MedIm
87
0
0
10 Mar 2025
Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
177
0
0
21 Dec 2024
The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models
Simone Caldarella
Massimiliano Mancini
Elisa Ricci
Rahaf Aljundi
PILM
72
2
0
02 Aug 2024
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi
Takashi Shibata
Makoto Terao
VLM
82
2
0
17 Jul 2024
MMIS: Multimodal Dataset for Interior Scene Visual Generation and Recognition
Hozaifa Kassab
Ahmed Mahmoud
Mohamed Bahaa
Ammar Mohamed
Ali Hamdi
VLM
104
0
0
08 Jul 2024
Accelerating Transformers with Spectrum-Preserving Token Merging
Hoai-Chau Tran
D. M. Nguyen
Duy M. Nguyen
Trung Thanh Nguyen
Ngan Le
Pengtao Xie
Daniel Sonntag
James Y. Zou
Binh T. Nguyen
Mathias Niepert
106
13
0
25 May 2024
DEMO: A Statistical Perspective for Efficient Image-Text Matching
Fan Zhang
Xian-Sheng Hua
Chong Chen
Xiao Luo
67
0
0
19 May 2024
Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction
Jiyuan Fu
Zhaoyu Chen
Kaixun Jiang
Haijing Guo
Jiafeng Wang
Shuyong Gao
Wenqiang Zhang
VLM AAML
81
4
0
16 Mar 2024
Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval
Shunsuke Tsubaki
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
Keisuke Imoto
65
1
0
16 Mar 2024
A Survey of Route Recommendations: Methods, Applications, and Opportunities
Shi-sheng Zhang
Zhipeng Luo
Li Yang
Fei Teng
Tian-Jie Li
AI4TS HAI
63
12
0
01 Mar 2024
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
Tao Tang
Dafeng Wei
Zhengyu Jia
Tian Gao
Changwei Cai
...
Yixing Zhao
Fu Liu
Xiaodan Liang
Xianpeng Lang
Yang Wang
67
7
0
02 Jan 2024
OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization
Dongchen Han
Xiaojun Jia
Yang Bai
Jindong Gu
Yang Liu
Xiaochun Cao
VLM
86
26
0
07 Dec 2023
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
Siyuan Liang
Mingli Zhu
Aishan Liu
Baoyuan Wu
Xiaochun Cao
Ee-Chien Chang
116
58
0
20 Nov 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai-Nguyen Nguyen
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
128
7
0
23 Sep 2023
State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
Devleena Das
Sonia Chernova
Been Kim
LRM LLMAG
105
24
0
21 Sep 2023
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Ali Abdari
Alex Falcon
Giuseppe Serra
61
4
0
06 Sep 2023
Progressive Feature Mining and External Knowledge-Assisted Text-Pedestrian Image Retrieval
Huafeng Li
Shedan Yang
Yafei Zhang
Dapeng Tao
Z. Yu
76
3
0
23 Aug 2023
Mitigating Test-Time Bias for Fair Image Retrieval
Fanjie Kong
Shuai Yuan
Weituo Hao
Ricardo Henao
65
19
0
23 May 2023
RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search
Yang Bai
Ming-Ming Cao
Daming Gao
Ziqiang Cao
Cheng Chen
Zhenfeng Fan
Liqiang Nie
Min Zhang
AI4TS
126
61
0
23 May 2023
Text-based Person Search without Parallel Image-Text Data
Yang Bai
Wenwen Qiang
Min Cao
Cheng Chen
Ziqiang Cao
Liqiang Nie
Min Zhang
90
15
0
22 May 2023
IIITD-20K: Dense captioning for Text-Image ReID
A. V. Subramanyam
N. Sundararajan
Vibhu Dubey
Brejesh Lall
VLM
30
3
0
08 May 2023
Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information
Sunjae Kwon
Rishabh Garodia
Minhwa Lee
Zhichao Yang
Hong-ye Yu
CoGe
91
5
0
02 May 2023
Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment
Zhen Zhang
Jialu Wang
Xinze Wang
VLM
94
2
0
02 May 2023
Interpreting Vision and Language Generative Models with Semantic Visual Priors
Michele Cafagna
L. Rojas-Barahona
Kees van Deemter
Albert Gatt
FAtt VLM
59
3
0
28 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
165
551
0
03 Apr 2023
Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening
Min Cao
Yang Bai
Wenwen Qiang
Ziqiang Cao
Liqiang Nie
Min Zhang
75
0
0
14 Mar 2023
Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching
Zheng Li
Caili Guo
Xin Eric Wang
Zerun Feng
Zhongtian Du
VLM
83
4
0
01 Mar 2023
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
66
12
0
29 Nov 2022
Data Poisoning Attacks Against Multimodal Encoders
Ziqing Yang
Xinlei He
Zheng Li
Michael Backes
Mathias Humbert
Pascal Berrang
Yang Zhang
AAML
176
52
0
30 Sep 2022
Test-time Training for Data-efficient UCDR
Soumava Paul
Titir Dutta
Aheli Saha
Abhishek Samanta
Soma Biswas
OOD
129
0
0
19 Aug 2022