Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.07048
Cited By
Pushing the Limits of Vision-Language Models in Remote Sensing without Human Annotations
11 September 2024
Keumgang Cha
Donggeun Yu
Junghoon Seo
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Pushing the Limits of Vision-Language Models in Remote Sensing without Human Annotations"
24 / 24 papers shown
Title
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
121
2,067
0
11 May 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
426
4,563
0
30 Jan 2023
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
194
3,482
0
16 Oct 2022
Learning to Evaluate Performance of Multi-modal Semantic Localization
Zhiqiang Yuan
Wenkai Zhang
Chongyang Li
Zhaoying Pan
Yongqiang Mao
Jialiang Chen
Shuoke Li
Hongqi Wang
Xian Sun
68
20
0
14 Sep 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
143
644
0
22 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
66
255
0
08 Aug 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
416
3,585
0
29 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
77
136
0
21 Apr 2022
Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information
Zhiqiang Yuan
Wenkai Zhang
Changyuan Tian
Xuee Rong
Zhengyuan Zhang
Hongqi Wang
Kun Fu
Xian Sun
68
126
0
21 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
500
6,279
0
05 Apr 2022
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
694
6,079
0
29 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
964
29,731
0
26 Feb 2021
High-resolution land cover change from low-resolution labels: Simple baselines for the 2021 IEEE GRSS Data Fusion Contest
Nikolay Malkin
Caleb Robinson
Nebojsa Jojic
26
5
0
04 Jan 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
664
41,103
0
22 Oct 2020
MLRSNet: A Multi-label High Spatial Resolution Remote Sensing Dataset for Semantic Scene Understanding
Xiaoman Qi
P. Zhu
Yuebin Wang
Liqiang Zhang
Junhuan Peng
Mengfan Wu
Jialong Chen
Xudong Zhao
Ning Zang
P. Mathiopoulos
89
116
0
01 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
820
42,055
0
28 May 2020
Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark
Ke Li
G. Wan
Gong Cheng
L. Meng
Junwei Han
64
1,449
0
31 Aug 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
61
471
0
21 Dec 2017
Functional Map of the World
Gordon A. Christie
Neil Fendley
James Wilson
R. Mukherjee
VGen
73
393
0
21 Nov 2017
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
P. Helber
B. Bischke
Andreas Dengel
Damian Borth
137
1,820
0
31 Aug 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
106
2,262
0
01 Mar 2017
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
257
3,723
0
28 Feb 2017
AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification
Gui-Song Xia
Jingwen Hu
Fan Hu
Baoguang Shi
X. Bai
Yanfei Zhong
Liangpei Zhang
80
1,723
0
18 Aug 2016
1