2503.19311
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text
25 March 2025
Weizhi Chen
Jingbo Chen
Yupeng Deng
Jiansheng Chen
Yuman Feng
Zhihao Xi
Diyou Liu
Kai Li
Yu Meng
VLM
Papers citing
"LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text"
23 / 23 papers shown
Title
LifeIR at the NTCIR-18 Lifelog-6 Task
Jiahan Chen
Da Li
Keping Bi
5
0
0
27 May 2025
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
Yimiao Zhou
Mengcheng Lan
Xiang Li
Yiping Ke
Xue Jiang
Qingyun Li
Xue Yang
Wayne Zhang
ObjD
VLM
157
6
0
16 Nov 2024
TULIP: Token-length Upgraded CLIP
Ivona Najdenkoska
Mohammad Mahdi Derakhshani
Yuki M. Asano
Nanne van Noord
Marcel Worring
Cees G. M. Snoek
VLM
74
4
0
13 Oct 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Wei Wu
Kecheng Zheng
Shuailei Ma
Fan Lu
Yuxin Guo
Yifei Zhang
Wei Chen
Qingpei Guo
Yujun Shen
Zheng-Jun Zha
VLM
59
9
0
07 Oct 2024
DreamLIP: Language-Image Pre-training with Long Captions
Kecheng Zheng
Yifei Zhang
Wei Wu
Fan Lu
Shuailei Ma
Xin Jin
Wei Chen
Yujun Shen
VLM
CLIP
78
27
0
25 Mar 2024
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Jiaqi Wang
CLIP
VLM
54
124
0
22 Mar 2024
SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery
Konstantin Klemmer
Esther Rolf
Caleb Robinson
Lester Mackey
M. Rußwurm
SSL
79
72
0
28 Nov 2023
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
55
145
0
24 Nov 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
61
1,028
0
27 Mar 2023
Learning to Evaluate Performance of Multi-modal Semantic Localization
Zhiqiang Yuan
Wenkai Zhang
Chongyang Li
Zhaoying Pan
Yongqiang Mao
Jialiang Chen
Shuoke Li
Hongqi Wang
Xian Sun
41
20
0
14 Sep 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
48
135
0
21 Apr 2022
SLIP: Self-supervision meets Language-Image Pre-training
Norman Mu
Alexander Kirillov
David Wagner
Saining Xie
VLM
CLIP
88
485
0
23 Dec 2021
RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong
Jianwei Yang
Pengchuan Zhang
Chunyuan Li
Noel Codella
...
Luowei Zhou
Xiyang Dai
Lu Yuan
Yin Li
Jianfeng Gao
VLM
CLIP
86
568
0
16 Dec 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
188
9,946
0
17 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
592
28,659
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
234
40,217
0
22 Oct 2020
Integration of the 3D Environment for UAV Onboard Visual Object Tracking
Stéphane Vujasinović
S. Becker
Timo Breuer
Sebastian Bullinger
N. Scherer-Negenborn
Michael Arens
72
12
0
06 Aug 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
432
41,106
0
28 May 2020
BigEarthNet: A Large-Scale Benchmark Archive For Remote Sensing Image Understanding
Gencer Sumbul
Marcela Charfuelan
Begüm Demir
Volker Markl
61
447
0
16 Feb 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
853
93,936
0
11 Oct 2018
Vision Meets Drones: A Challenge
Peng Fei Zhu
Longyin Wen
Xiao Bian
Haibin Ling
Q. Hu
34
398
0
20 Apr 2018
Predicting Ground-Level Scene Layout from Aerial Imagery
Menghua Zhai
Zachary Bessinger
Scott Workman
Nathan Jacobs
79
225
0
08 Dec 2016
FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff
Dmitry Kalenichenko
James Philbin
3DH
244
13,079
0
12 Mar 2015