2503.13814
FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification
18 March 2025
Jiadong Wang
Weiwei Song
Hao Chen
Jie Ren
Huimin Zhao
Papers citing "FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification"
8 / 8 papers shown
Image Fusion via Vision-Language Model
Zixiang Zhao
Lilun Deng
Haowen Bai
Yukun Cui
Zhipeng Zhang
...
Haotong Qin
Dongdong Chen
Jiangshe Zhang
Peng Wang
Luc Van Gool
VLM
53
21
0
03 Feb 2024
Nearest Neighbor-Based Contrastive Learning for Hyperspectral and LiDAR Data Classification
Meng Wang
Feng Gao
Junyu Dong
Hengchao Li
Q. Du
SSL
53
71
0
09 Jan 2023
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning
Shi-You Xu
VLM
DiffM
63
12
0
10 Oct 2022
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
760
28,659
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
407
3,778
0
11 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
124
2,174
0
23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
460
40,217
0
22 Oct 2020
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
120
1,939
0
09 Aug 2019