2503.22081
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
28 March 2025
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Yiming Lei
Chenkai Zhang
Zeming Liu
Qingjie Liu
Yansen Wang
Papers citing
"A Survey on Remote Sensing Foundation Models: From Vision to Multimodality"
50 / 96 papers shown
Title
REOBench: Benchmarking Robustness of Earth Observation Foundation Models
Xiang Li
Yong Tao
Siyuan Zhang
Siwei Liu
Zhitong Xiong
Chunbo Luo
L. J. Liu
Mykola Pechenizkiy
Xiao Xiang Zhu
T. Huang
62
0
0
22 May 2025
A Survey on Data Synthesis and Augmentation for Large Language Models
Ke Wang
Jiahui Zhu
Minjie Ren
Ziqiang Liu
Shiwei Li
...
Yiming Lei
Xiaoyu Wu
Qiqi Zhan
Qingjie Liu
Yunhong Wang
SyDa
165
21
0
16 Oct 2024
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With Affine Transformation Contrastive Learning
Wenhui Diao
Haichen Yu
Kaiyue Kang
Tong Ling
Di Liu
...
Hanbo Bi
Libo Ren
Xuexue Li
Yongqiang Mao
Xian Sun
256
1
0
20 Sep 2024
A Survey of the Self Supervised Learning Mechanisms for Vision Transformers
Asifullah Khan
A. Sohail
Mustansar Fiaz
Mehdi Hassan
Tariq Habib Afridi
...
Muhammad Zaigham Zaheer
Kamran Ali
Tangina Sultana
Ziaurrehman Tanoli
Naeem Akhter
262
5
0
30 Aug 2024
Masked Angle-Aware Autoencoder for Remote Sensing Images
Zhihao Li
B. Hou
Siteng Ma
Zitong Wu
Xianpeng Guo
Bo Ren
Licheng Jiao
119
13
0
04 Aug 2024
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
Shixiong Xu
Chenghao Zhang
Lubin Fan
Gaofeng Meng
Shiming Xiang
Jieping Ye
VLM
88
5
0
11 Jul 2024
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
Di Wang
Meiqi Hu
Yao Jin
Yuchun Miao
Jiaqi Yang
...
Lefei Zhang
Chen Wu
Di Lin
Dacheng Tao
Liangpei Zhang
151
27
0
17 Jun 2024
SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding
Junwei Luo
Zhen Pang
Yongjun Zhang
Tingzhu Wang
Linlin Wang
...
Jiangwei Lao
Jian Wang
Jingdong Chen
Yihua Tan
Yansheng Li
96
27
0
14 Jun 2024
A$^{2}$-MAE: A spatial-temporal-spectral unified remote sensing pre-training method based on anchor-aware masked autoencoder
Lixian Zhang
Yi Zhao
Runmin Dong
Jinxiao Zhang
Shuai Yuan
...
Weijia Li
Wei Liu
Wayne Zhang
Xue Jiang
Haohuan Fu
101
4
0
12 Jun 2024
MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
Vishal Nedungadi
A. Kariryaa
Stefan Oehmcke
Serge Belongie
Christian Igel
Nico Lang
103
28
0
04 May 2024
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Jiaqi Wang
CLIP
VLM
90
141
0
22 Mar 2024
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery
Mubashir Noman
Muzammal Naseer
Hisham Cholakkal
Rao Muhammad Anwar
Salman Khan
Fahad Shahbaz Khan
ViT
85
45
0
08 Mar 2024
Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery
Wei Zhang
Miaoxin Cai
Tong Zhang
Guoqiang Lei
Zhuang Yin
Xuerui Mao
76
8
0
06 Mar 2024
LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model
Dilxat Muhtar
Zhenshi Li
Feng-Xue Gu
Xue-liang Zhang
Pengfeng Xiao
171
62
0
04 Feb 2024
Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models
Haonan Guo
Xin Su
Chen Wu
Bo Du
Lefei Zhang
Deren Li
LLMAG
78
15
0
17 Jan 2024
Generic Knowledge Boosted Pre-training For Remote Sensing Images
Ziyue Huang
Mingming Zhang
Yuan Gong
Qingjie Liu
Yunhong Wang
VLM
68
15
0
09 Jan 2024
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning
Cong Yang
Zuchao Li
Lefei Zhang
62
25
0
02 Dec 2023
SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery
Konstantin Klemmer
Esther Rolf
Caleb Robinson
Lester Mackey
M. Rußwurm
SSL
122
78
0
28 Nov 2023
Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture
Wei-Jang Li
Yang Wei
Tianpeng Liu
Yuenan Hou
Yuxuan Li
Zhen Liu
Yongxiang Liu
Li Liu
93
19
0
26 Nov 2023
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
93
154
0
24 Nov 2023
Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis
Siqi Du
Shengjun Tang
Weixi Wang
Xiaoming Li
Renzhong Guo
108
9
0
07 Oct 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Avamarie Brueggeman
Andrea Madotto
Zhaojiang Lin
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
82
94
0
27 Sep 2023
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization
V. Cepeda
Gaurav Kumar Nayak
Mubarak Shah
78
102
0
27 Sep 2023
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
Yuelei Wang
Ting Zhang
Liangjin Zhao
Lin Hu
Zhechao Wang
...
Kaiqiang Chen
Xuan Zeng
Zhirui Wang
Hongqi Wang
Xian Sun
82
5
0
16 Sep 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi
Wenxiang Chen
Xin Guo
Wei He
Yiwen Ding
...
Wenjuan Qin
Yongyan Zheng
Xipeng Qiu
Xuanjing Huang
Tao Gui
LM&MA
LM&Ro
3DV
AI4CE
135
956
0
14 Sep 2023
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval
Yuan Yuan
Yangfan Zhan
Zhitong Xiong
VLM
80
47
0
24 Aug 2023
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages
Jinyi Hu
Yuan Yao
Chong Wang
Shanonan Wang
Yinxu Pan
...
Yankai Lin
Jiao Xue
Dahai Li
Zhiyuan Liu
Maosong Sun
MLLM
VLM
97
55
0
23 Aug 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
131
122
0
18 May 2023
What Do Self-Supervised Vision Transformers Learn?
Namuk Park
Wonjae Kim
Byeongho Heo
Taekyung Kim
Sangdoo Yun
SSL
176
80
1
01 May 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
573
4,925
0
17 Apr 2023
Emergent autonomous scientific research capabilities of large language models
Daniil A. Boiko
R. MacKnight
Gabe Gomes
ELM
LM&Ro
AI4CE
LLMAG
160
127
0
11 Apr 2023
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Ge Li
Hasan Hammoud
Hani Itani
Dmitrii Khizbullin
Guohao Li
SyDa
ALM
133
513
0
31 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
151
513
0
27 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.5K
14,761
0
15 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
195
2,028
0
09 Mar 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
184
4,180
1
10 Feb 2023
Towards Geospatial Foundation Models via Continual Pretraining
Matías Mendieta
Boran Han
Xingjian Shi
Yi Zhu
Chen Chen
VLM
AI4CE
117
73
0
09 Feb 2023
Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization
Lukas Haas
Silas Alberti
Michal Skreta
VLM
80
24
0
01 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
432
4,663
0
30 Jan 2023
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
71
321
0
12 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
71
256
0
08 Aug 2022
Towards Large-Scale Small Object Detection: Survey and Benchmarks
Gong Cheng
Xiang Yuan
Xiwen Yao
Ke Yan
Qinghua Zeng
Xingxing Xie
Junwei Han
ObjD
98
333
0
28 Jul 2022
Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain
Tong Zhang
Peng Gao
Hao-Chen Dong
Zhuang Yin
Guanqun Wang
Wei Zhang
He Chen
70
34
0
08 Jul 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
186
1,309
0
04 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
418
3,610
0
29 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
85
138
0
21 Apr 2022
TOV: The Original Vision Model for Optical Remote Sensing Image Understanding via Self-supervised Learning
Chao Tao
Ji Qi
Guo Zhang
Qing Zhu
Weipeng Lu
Haifeng Li
104
43
0
10 Apr 2022
DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation
Aysim Toker
L. Kondmann
Mark Weber
Marvin Eisenberger
Andrés Camero
...
T. Davis
Daniel Cremers
G. Marchisio
Xiao Xiang Zhu
Laura Leal-Taixé
56
85
0
23 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
859
9,714
0
28 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
191
5,226
0
10 Jan 2022