arXiv:2204.13861
Where in the World is this Image? Transformer-based Geo-localization in the Wild
29 April 2022
Shraman Pramanick
E. Nowara
Joshua Gleason
Carlos D. Castillo
Rama Chellappa
ViT
Papers citing
"Where in the World is this Image? Transformer-based Geo-localization in the Wild"
24 / 24 papers shown
Title
LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space
Zhangyu Wang
Jielu Zhang
Zhongliang Zhou
Qian Cao
Nemin Wu
...
Lan Mu
Yang Song
Yiqun Xie
Ni Lao
Gengchen Mai
DiffM
34
0
0
23 Mar 2025
NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization
Zheyuan Zhang
Runze Li
Tasnim Kabir
Jordan Boyd-Graber
55
0
0
21 Feb 2025
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework
Zirui Song
Jingpu Yang
Yuan Huang
Jonathan Tonglet
Zeyu Zhang
Tao Cheng
Meng Fang
Iryna Gurevych
X. Chen
LRM
65
1
0
19 Feb 2025
CityGuessr: City-Level Video Geo-Localization on a Global Scale
P. Kulkarni
Gaurav Kumar Nayak
Mubarak Shah
ViT
AI4TS
29
2
0
10 Nov 2024
Statewide Visual Geolocalization in the Wild
F. Fervers
Sebastian Bullinger
C. Bodensteiner
Michael Arens
Rainer Stiefelhagen
23
2
0
25 Sep 2024
Enhancing Worldwide Image Geolocation by Ensembling Satellite-Based Ground-Level Attribute Predictors
Michael J. Bianco
David Eigen
Michael Gormish
22
0
0
18 Jul 2024
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
Shixiong Xu
Chenghao Zhang
Lubin Fan
Gaofeng Meng
Shiming Xiang
Jieping Ye
VLM
41
4
0
11 Jul 2024
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
Anindya Sarkar
S. Sastry
Aleksis Pirinen
Chongjie Zhang
Nathan Jacobs
Yevgeniy Vorobeychik
49
4
0
04 Jun 2024
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
Ling Li
Yu Ye
Bingchuan Jiang
Wei Zeng
VLM
LRM
31
7
0
03 Jun 2024
G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models
Pengyue Jia
Yiding Liu
Xiaopeng Li
Xiangyu Zhao
Yuhao Wang
Yantong Du
Xiao Han
Xuetao Wei
Shuaiqiang Wang
Dawei Yin
DiffM
34
6
0
23 May 2024
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Guillaume Astruc
Nicolas Dufour
Ioannis Siglidis
Constantin Aronssohn
Nacim Bouia
...
Charles Raude
Elliot Vincent
Lintao Xu
Hongyu Zhou
Loic Landrieu
27
6
0
29 Apr 2024
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation
Zhongliang Zhou
Jielu Zhang
Zihan Guan
Mengxuan Hu
Ni Lao
Lan Mu
Sheng R. Li
Gengchen Mai
VLM
43
12
0
28 Mar 2024
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization
V. Cepeda
Gaurav Kumar Nayak
Mubarak Shah
13
83
0
27 Sep 2023
Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
Brandon Clark
Alec Kerrigan
P. Kulkarni
V. Cepeda
M. Shah
19
21
0
07 Mar 2023
Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization
Lukas Haas
Silas Alberti
Michal Skreta
VLM
23
21
0
01 Feb 2023
Aerial View Localization with Reinforcement Learning: Towards Emulating Search-and-Rescue
Aleksis Pirinen
A. Samuelsson
John Backsund
Kalle Åström
39
4
0
08 Sep 2022
Visual and Object Geo-localization: A Comprehensive Survey
Daniel Wilson
Xiaohan Zhang
Waqas Sultani
S. Wshah
24
15
0
30 Dec 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
259
558
0
28 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
248
577
0
22 Apr 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
176
686
0
22 Apr 2021
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Stephen Hausler
Sourav Garg
Ming Xu
Michael Milford
Tobias Fischer
53
329
0
02 Mar 2021
CrossTransformers: spatially-aware few-shot transfer
Carl Doersch
Ankush Gupta
Andrew Zisserman
ViT
203
330
0
22 Jul 2020
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
195
3,529
0
20 Aug 2019
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
253
1,827
0
18 Aug 2016