2411.14704
Cited By
Cross-Modal Pre-Aligned Method with Global and Local Information for Remote-Sensing Image and Text Retrieval
22 November 2024
Zengbao Sun
Ming Zhao
Gaorui Liu
Andre Kaup
Papers citing
"Cross-Modal Pre-Aligned Method with Global and Local Information for Remote-Sensing Image and Text Retrieval"
23 / 23 papers shown
Title
An Enhanced Large Language Model For Cross Modal Query Understanding System Using DL-KeyBERT Based CAZSSCL-MPGPT
Shreya Singh
95
0
0
24 Feb 2025
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval
Yuan Yuan
Yangfan Zhan
Zhitong Xiong
VLM
73
45
0
24 Aug 2023
Global Context Vision Transformers
Ali Hatamizadeh
Hongxu Yin
Greg Heinrich
Jan Kautz
Pavlo Molchanov
ViT
69
127
0
20 Jun 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
77
136
0
21 Apr 2022
Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information
Zhiqiang Yuan
Wenkai Zhang
Changyuan Tian
Xuee Rong
Zhengyuan Zhang
Hongqi Wang
Kun Fu
Xian Sun
68
126
0
21 Apr 2022
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval
Mengjun Cheng
Yipeng Sun
Long Wang
Xiongwei Zhu
Kun Yao
...
Guoli Song
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
93
62
0
31 Mar 2022
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Hangbo Bao
Wenhui Wang
Li Dong
Qiang Liu
Owais Khan Mohammed
Kriti Aggarwal
Subhojit Som
Furu Wei
VLM
MLLM
MoE
74
558
0
03 Nov 2021
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Shafiq Joty
Caiming Xiong
Guosheng Lin
FaML
215
1,970
0
16 Jul 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
461
21,439
0
25 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
967
29,731
0
26 Feb 2021
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Wonjae Kim
Bokyung Son
Ildoo Kim
VLM
CLIP
128
1,749
0
05 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
667
41,369
0
22 Oct 2020
FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval
D. Gao
Linbo Jin
Ben Chen
Minghui Qiu
Peng Li
Yi Wei
Yitao Hu
Haozhe Jasper Wang
OOD
75
133
0
20 May 2020
Visual Semantic Reasoning for Image-Text Matching
Kunpeng Li
Yulun Zhang
Keqin Li
Yuanyuan Li
Y. Fu
VLM
87
505
0
06 Sep 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Joey Tianyi Zhou
VLM
MLLM
247
2,488
0
20 Aug 2019
Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking
Tan Wang
Xing Xu
Yang Yang
Alan Hanjalic
Heng Tao Shen
Jingkuan Song
33
148
0
12 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
234
3,693
0
06 Aug 2019
Understanding urban landuse from the above and ground perspectives: a deep learning, multimodal solution
Shivangi Srivastava
John E. Vargas-Muñoz
D. Tuia
75
138
0
05 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
98
1,154
0
21 Mar 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
61
475
0
21 Dec 2017
Dual-Path Convolutional Image-Text Embeddings with Instance Loss
Zhedong Zheng
Liang Zheng
Michael Garrett
Yi Yang
Mingliang Xu
Yi-Dong Shen
104
477
0
15 Nov 2017
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Junyoung Chung
Çağlar Gülçehre
Kyunghyun Cho
Yoshua Bengio
593
12,734
0
11 Dec 2014