ResearchTrend.AI
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval

12 September 2019
Zihao Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
    VLM

Papers citing "CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval"

50 / 110 papers shown
Title
Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing
Chengyu Zheng
Ning Song
Ruoyu Zhang
Lei Huang
Zhiqiang Wei
Jie Nie
58
16
0
12 Dec 2022
Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Representation
En Yu
Songtao Liu
Zhuoling Li
Jinrong Yang
Zeming Li
Shoudong Han
Wenbing Tao
110
13
0
03 Dec 2022
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim
Nam-Won Kim
Suha Kwak
118
40
0
30 Nov 2022
Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation
Runbang Zhang
Yixiao Zhang
Kai Shao
Ying Shan
Gus Xia
64
4
0
10 Nov 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
105
19
0
05 Oct 2022
Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval
Zheng Li
Caili Guo
Xin Eric Wang
Zerun Feng
Lei Li
Zhongtian Du
VLM
70
2
0
28 Sep 2022
Exploring Visual Interpretability for Contrastive Language-Image Pre-training
Yi Li
Hualiang Wang
Yiqun Duan
Han Xu
Xiaomeng Li
CLIP VLM
153
27
0
15 Sep 2022
Learning to Evaluate Performance of Multi-modal Semantic Localization
Zhiqiang Yuan
Wenkai Zhang
Chongyang Li
Zhaoying Pan
Yongqiang Mao
Jialiang Chen
Shuoke Li
Hongqi Wang
Xian Sun
99
20
0
14 Sep 2022
CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding
Xingyu Bruce Liu
Ruolin Wang
Dingzeyu Li
Xiang 'Anthony' Chen
Amy Pavel
101
27
0
23 Aug 2022
Understanding Attention for Vision-and-Language Tasks
Feiqi Cao
S. Han
Siqu Long
Changwei Xu
Josiah Poon
77
5
0
17 Aug 2022
Discover and Mitigate Unknown Biases with Debiasing Alternate Networks
Zhiheng Li
A. Hoogs
Chenliang Xu
83
55
0
20 Jul 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
76
68
0
17 Jul 2022
Dynamic Contrastive Distillation for Image-Text Retrieval
Jun Rao
Liang Ding
Shuhan Qi
Meng Fang
Yang Liu
Liqiong Shen
Dacheng Tao
VLM
112
32
0
04 Jul 2022
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
Feilong Chen
Xiuyi Chen
Jiaxin Shi
Duzhen Zhang
Jianlong Chang
Qi Tian
VLM CLIP
93
6
0
24 May 2022
Reducing Predictive Feature Suppression in Resource-Constrained Contrastive Image-Caption Retrieval
Maurits J. R. Bleeker
Andrew Yates
Maarten de Rijke
91
4
0
28 Apr 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
94
30
0
25 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
99
139
0
21 Apr 2022
Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information
Zhiqiang Yuan
Wenkai Zhang
Changyuan Tian
Xuee Rong
Zhengyuan Zhang
Hongqi Wang
Kun Fu
Xian Sun
94
130
0
21 Apr 2022
Uncertainty-based Cross-Modal Retrieval with Probabilistic Representations
Leila Pishdad
Ran Zhang
Konstantinos G. Derpanis
Allan D. Jepson
Afsaneh Fazly
41
2
0
20 Apr 2022
Partially Does It: Towards Scene-Level FG-SBIR with Partial Input
Pinaki Nath Chowdhury
A. Bhunia
Viswanatha Reddy Gajjala
Aneeshan Sain
Tao Xiang
Yi-Zhe Song
127
21
0
28 Mar 2022
Image-text Retrieval: A Survey on Recent Research and Development
Min Cao
Shiping Li
Juntao Li
Liqiang Nie
Min Zhang
97
85
0
28 Mar 2022
Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval
Guanyu Cai
Yixiao Ge
Binjie Zhang
Alex Jinpeng Wang
Rui Yan
...
Ying Shan
Lianghua He
Xiaohu Qie
Jianping Wu
Mike Zheng Shou
VLM
46
6
0
15 Mar 2022
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Ran Chen
Hanli Wang
Lei Wang
Sam Kwong
57
9
0
10 Mar 2022
Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval
Jun Rao
Fei Wang
Liang Ding
Shuhan Qi
Yibing Zhan
Weifeng Liu
Dacheng Tao
OOD
89
30
0
08 Mar 2022
Multi-Modal Knowledge Graph Construction and Application: A Survey
Xiangru Zhu
Zhixu Li
Xiaodan Wang
Xueyao Jiang
Penglei Sun
Xuwu Wang
Yanghua Xiao
N. Yuan
73
167
0
11 Feb 2022
Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics
Hangjie Yuan
Mang Wang
Dong Ni
Liangpeng Xu
89
40
0
01 Feb 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval
Zhixiong Zeng
Wenji Mao
VLM
52
18
0
08 Jan 2022
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM CLIP
224
582
0
02 Dec 2021
Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Jianqing Fan
53
7
0
05 Nov 2021
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
237
18
0
06 Oct 2021
FooDI-ML: a large multi-language dataset of food, drinks and groceries images and descriptions
David Amat Olóndriz
Ponç Puigdevall
A. S. Palau
VLM
97
7
0
05 Oct 2021
An animated picture says at least a thousand words: Selecting Gif-based Replies in Multimodal Dialog
Xingyao Wang
David Jurgens
59
5
0
24 Sep 2021
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Haijun Shan
Xuanjing Huang
Jianqing Fan
CLIP
45
12
0
12 Sep 2021
Semantically Self-Aligned Network for Text-to-Image Part-aware Person Re-identification
Z. Ding
Changxing Ding
Zhiyin Shao
Dacheng Tao
115
138
0
27 Jul 2021
Step-Wise Hierarchical Alignment Network for Image-Text Matching
Zhong Ji
Kexin Chen
Haoran Wang
80
94
0
11 Jun 2021
Learning Relation Alignment for Calibrated Cross-modal Retrieval
Shuhuai Ren
Junyang Lin
Guangxiang Zhao
Rui Men
An Yang
Jingren Zhou
Xu Sun
Hongxia Yang
77
38
0
28 May 2021
Cross-Modal Generative Augmentation for Visual Question Answering
Zixu Wang
Yishu Miao
Lucia Specia
75
11
0
11 May 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Shir Gur
Natalia Neverova
C. Stauffer
Ser-Nam Lim
Douwe Kiela
A. Reiter
147
30
0
16 Apr 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
40
3
0
29 Mar 2021
An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information
Zejun Li
Zhongyu Wei
Zhihao Fan
Haijun Shan
Xuanjing Huang
45
5
0
21 Mar 2021
LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
Siqi Sun
Yen-Chun Chen
Linjie Li
Shuohang Wang
Yuwei Fang
Jingjing Liu
VLM
89
84
0
16 Mar 2021
Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval
Soravit Changpinyo
Jordi Pont-Tuset
V. Ferrari
Radu Soricut
66
26
0
09 Feb 2021
Probabilistic Embeddings for Cross-Modal Retrieval
Sanghyuk Chun
Seong Joon Oh
Rafael Sampaio de Rezende
Yannis Kalantidis
Diane Larlus
UQCV
521
210
0
13 Jan 2021
Similarity Reasoning and Filtration for Image-Text Matching
Haiwen Diao
Ying Zhang
Lingyun Ma
Huchuan Lu
307
347
0
05 Jan 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD VLM
340
157
0
02 Jan 2021
VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words
Xiaopeng Lu
Tiancheng Zhao
Kyusong Lee
71
27
0
01 Jan 2021
ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora
Ouyang Xuan
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
140
102
0
31 Dec 2020
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
Yujie Zhong
Linhai Xie
Sen Wang
Lucia Specia
Yishu Miao
SSL
26
0
0
19 Nov 2020
Structured Visual Search via Composition-aware Learning
Mert Kilickaya
A. Smeulders
CoGe
62
5
0
27 Oct 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumder
Soujanya Poria
Roger Zimmermann
Amir Zadeh
101
6
0
19 Oct 2020