Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.15969
Cited By
v1
v2 (latest)
Beyond the Visible: Multispectral Vision-Language Learning for Earth Observation
20 March 2025
Clive Tinashe Marimo
Benedikt Blumenstiel
Maximilian Nitsche
Johannes Jakubik
Thomas Brunschwiler
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Beyond the Visible: Multispectral Vision-Language Learning for Earth Observation"
21 / 21 papers shown
Title
TerraMind: Large-Scale Generative Multimodality for Earth Observation
Johannes Jakubik
Felix Yang
Benedikt Blumenstiel
Erik Scheurer
Rocco Sedona
...
P. Fraccaro
Thomas Brunschwiler
Gabriele Cavallaro
Juan Bernabé-Moreno
Nicolas Longépé
MLLM
VLM
115
6
0
15 Apr 2025
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications
Daniela Szwarcman
Sujit Roy
P. Fraccaro
Þorsteinn Elí Gíslason
Benedikt Blumenstiel
...
Rahul Ramachandran
Juan Bernabé-Moreno
Manil Maskey
Rahul Ramachandran
Juan Bernabe Moreno
VLM
132
19
0
03 Dec 2024
ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models
Zhenghang Yuan
Zhitong Xiong
Lichao Mou
Xiao Xiang Zhu
88
9
0
17 Feb 2024
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
Zhecheng Wang
R. Prabha
Tianyuan Huang
Jiajun Wu
Ram Rajagopal
78
66
0
20 Dec 2023
Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
Utkarsh Mall
Cheng Perng Phoo
Meilin Kelsey Liu
Carl Vondrick
B. Hariharan
Kavita Bala
VLM
72
42
0
12 Dec 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
107
18
0
27 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffM
VLM
127
66
0
20 Jun 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
Fan Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
152
224
0
19 Jun 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
437
4,663
0
30 Jan 2023
Reproducible scaling laws for contrastive language-image learning
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLM
CLIP
137
821
0
14 Dec 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
206
3,507
0
16 Oct 2022
METER-ML: A Multi-Sensor Earth Observation Benchmark for Automated Methane Source Mapping
B. Zhu
Nicholas Lui
Jeremy Irvin
Jimmy Le
Sahil Tadwalkar
Chenghao Wang
Zutao Ouyang
Frankie Y. Liu
Andrew Y. Ng
R. Jackson
91
16
0
22 Jul 2022
SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery
Yezhen Cong
Samarth Khanna
Chenlin Meng
Patrick Liu
Erik Rozi
Yutong He
Marshall Burke
David B. Lobell
Stefano Ermon
ViT
99
276
0
17 Jul 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
88
138
0
21 Apr 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
557
4,421
0
28 Jan 2022
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.0K
29,944
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
482
3,906
0
11 Feb 2021
BigEarthNet: A Large-Scale Benchmark Archive For Remote Sensing Image Understanding
Gencer Sumbul
Marcela Charfuelan
Begüm Demir
Volker Markl
101
455
0
16 Feb 2019
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
63
477
0
21 Dec 2017
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
P. Helber
B. Bischke
Andreas Dengel
Damian Borth
158
1,834
0
31 Aug 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
108
2,269
0
01 Mar 2017
1