1909.05506
Cited By
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
12 September 2019
Zihao Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
VLM
Papers citing
"CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval"
50 / 110 papers shown
Title
Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing
Chengyu Zheng
Ning Song
Ruoyu Zhang
Lei Huang
Zhiqiang Wei
Jie Nie
58
16
0
12 Dec 2022
Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Representation
En Yu
Songtao Liu
Zhuoling Li
Jinrong Yang
Zeming Li
Shoudong Han
Wenbing Tao
110
13
0
03 Dec 2022
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
118
40
0
30 Nov 2022
Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation
Runbang Zhang
Yixiao Zhang
Kai Shao
Ying Shan
Gus Xia
64
4
0
10 Nov 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
105
19
0
05 Oct 2022
Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval
Zheng Li
Caili Guo
Xin Eric Wang
Zerun Feng
Lei Li
Zhongtian Du
VLM
70
2
0
28 Sep 2022
Exploring Visual Interpretability for Contrastive Language-Image Pre-training
Yi Li
Hualiang Wang
Yiqun Duan
Han Xu
Xiaomeng Li
CLIP
VLM
153
27
0
15 Sep 2022
Learning to Evaluate Performance of Multi-modal Semantic Localization
Zhiqiang Yuan
Wenkai Zhang
Chongyang Li
Zhaoying Pan
Yongqiang Mao
Jialiang Chen
Shuoke Li
Hongqi Wang
Xian Sun
99
20
0
14 Sep 2022
CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding
Xingyu Bruce Liu
Ruolin Wang
Dingzeyu Li
Xiang 'Anthony' Chen
Amy Pavel
101
27
0
23 Aug 2022
Understanding Attention for Vision-and-Language Tasks
Feiqi Cao
S. Han
Siqu Long
Changwei Xu
Josiah Poon
77
5
0
17 Aug 2022
Discover and Mitigate Unknown Biases with Debiasing Alternate Networks
Zhiheng Li
A. Hoogs
Chenliang Xu
83
55
0
20 Jul 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
76
68
0
17 Jul 2022
Dynamic Contrastive Distillation for Image-Text Retrieval
Jun Rao
Liang Ding
Shuhan Qi
Meng Fang
Yang Liu
Liqiong Shen
Dacheng Tao
VLM
112
32
0
04 Jul 2022
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
Feilong Chen
Xiuyi Chen
Jiaxin Shi
Duzhen Zhang
Jianlong Chang
Qi Tian
VLM
CLIP
93
6
0
24 May 2022
Reducing Predictive Feature Suppression in Resource-Constrained Contrastive Image-Caption Retrieval
Maurits J. R. Bleeker
Andrew Yates
Maarten de Rijke
91
4
0
28 Apr 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
94
30
0
25 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
99
139
0
21 Apr 2022
Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information
Zhiqiang Yuan
Wenkai Zhang
Changyuan Tian
Xuee Rong
Zhengyuan Zhang
Hongqi Wang
Kun Fu
Xian Sun
94
130
0
21 Apr 2022
Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations
Leila Pishdad
Ran Zhang
Konstantinos G. Derpanis
Allan D. Jepson
Afsaneh Fazly
41
2
0
20 Apr 2022
Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
Pinaki Nath Chowdhury
A. Bhunia
Viswanatha Reddy Gajjala
Aneeshan Sain
Tao Xiang
Yi-Zhe Song
127
21
0
28 Mar 2022
Image-text Retrieval: A Survey on Recent Research and Development
Min Cao
Shiping Li
Juntao Li
Liqiang Nie
Min Zhang
97
85
0
28 Mar 2022
Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval
Guanyu Cai
Yixiao Ge
Binjie Zhang
Alex Jinpeng Wang
Rui Yan
...
Ying Shan
Lianghua He
Xiaohu Qie
Jianping Wu
Mike Zheng Shou
VLM
46
6
0
15 Mar 2022
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Ran Chen
Hanli Wang
Lei Wang
Sam Kwong
57
9
0
10 Mar 2022
Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval
Jun Rao
Fei Wang
Liang Ding
Shuhan Qi
Yibing Zhan
Weifeng Liu
Dacheng Tao
OOD
89
30
0
08 Mar 2022
Multi-Modal Knowledge Graph Construction and Application: A Survey
Xiangru Zhu
Zhixu Li
Xiaodan Wang
Xueyao Jiang
Penglei Sun
Xuwu Wang
Yanghua Xiao
N. Yuan
73
167
0
11 Feb 2022
Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics
Hangjie Yuan
Mang Wang
Dong Ni
Liangpeng Xu
89
40
0
01 Feb 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval
Zhixiong Zeng
Wenji Mao
VLM
52
18
0
08 Jan 2022
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
224
582
0
02 Dec 2021
Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Jianqing Fan
53
7
0
05 Nov 2021
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
237
18
0
06 Oct 2021
FooDI-ML: a large multi-language dataset of food, drinks and groceries images and descriptions
David Amat Olóndriz
Ponç Puigdevall
A. S. Palau
VLM
97
7
0
05 Oct 2021
An animated picture says at least a thousand words: Selecting Gif-based Replies in Multimodal Dialog
Xingyao Wang
David Jurgens
59
5
0
24 Sep 2021
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Haijun Shan
Xuanjing Huang
Jianqing Fan
CLIP
45
12
0
12 Sep 2021
Semantically Self-Aligned Network for Text-to-Image Part-aware Person Re-identification
Z. Ding
Changxing Ding
Zhiyin Shao
Dacheng Tao
115
138
0
27 Jul 2021
Step-Wise Hierarchical Alignment Network for Image-Text Matching
Zhong Ji
Kexin Chen
Haoran Wang
80
94
0
11 Jun 2021
Learning Relation Alignment for Calibrated Cross-modal Retrieval
Shuhuai Ren
Junyang Lin
Guangxiang Zhao
Rui Men
An Yang
Jingren Zhou
Xu Sun
Hongxia Yang
77
38
0
28 May 2021
Cross-Modal Generative Augmentation for Visual Question Answering
Zixu Wang
Yishu Miao
Lucia Specia
75
11
0
11 May 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Shir Gur
Natalia Neverova
C. Stauffer
Ser-Nam Lim
Douwe Kiela
A. Reiter
147
30
0
16 Apr 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
40
3
0
29 Mar 2021
An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information
Zejun Li
Zhongyu Wei
Zhihao Fan
Haijun Shan
Xuanjing Huang
45
5
0
21 Mar 2021
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
89
84
0
16 Mar 2021
Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval
Soravit Changpinyo
Jordi Pont-Tuset
V. Ferrari
Radu Soricut
66
26
0
09 Feb 2021
Probabilistic Embeddings for Cross-Modal Retrieval
Sanghyuk Chun
Seong Joon Oh
Rafael Sampaio de Rezende
Yannis Kalantidis
Diane Larlus
UQCV
521
210
0
13 Jan 2021
Similarity Reasoning and Filtration for Image-Text Matching
Haiwen Diao
Ying Zhang
Lingyun Ma
Huchuan Lu
307
347
0
05 Jan 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
340
157
0
02 Jan 2021
VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words
Xiaopeng Lu
Tiancheng Zhao
Kyusong Lee
71
27
0
01 Jan 2021
ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora
Ouyang Xuan
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
140
102
0
31 Dec 2020
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
Yujie Zhong
Linhai Xie
Sen Wang
Lucia Specia
Yishu Miao
SSL
26
0
0
19 Nov 2020
Structured Visual Search via Composition-aware Learning
Mert Kilickaya
A. Smeulders
CoGe
62
5
0
27 Oct 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumdar
Soujanya Poria
Roger Zimmermann
Amir Zadeh
101
6
0
19 Oct 2020