Where in the World is this Image? Transformer-based Geo-localization in the Wild

29 April 2022

Papers citing "Where in the World is this Image? Transformer-based Geo-localization in the Wild"


Title
LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space Zhangyu Wang Jielu Zhang Zhongliang Zhou Qian Cao Nemin Wu ... Lan Mu Yang Song Yiqun Xie Ni Lao Gengchen Mai DiffM 34 0 0 23 Mar 2025
NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization Zheyuan Zhang Runze Li Tasnim Kabir Jordan Boyd-Graber 55 0 0 21 Feb 2025
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework Zirui Song Jingpu Yang Yuan Huang Jonathan Tonglet Zeyu Zhang Tao Cheng Meng Fang Iryna Gurevych X. Chen LRM 65 1 0 19 Feb 2025
CityGuessr: City-Level Video Geo-Localization on a Global Scale P. Kulkarni Gaurav Kumar Nayak Mubarak Shah ViT AI4TS 29 2 0 10 Nov 2024
Statewide Visual Geolocalization in the Wild F. Fervers Sebastian Bullinger C. Bodensteiner Michael Arens Rainer Stiefelhagen 23 2 0 25 Sep 2024
Enhancing Worldwide Image Geolocation by Ensembling Satellite-Based Ground-Level Attribute Predictors Michael J. Bianco David Eigen Michael Gormish 22 0 0 18 Jul 2024
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization Shixiong Xu Chenghao Zhang Lubin Fan Gaofeng Meng Shiming Xiang Jieping Ye VLM 41 4 0 11 Jul 2024
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization Anindya Sarkar S. Sastry Aleksis Pirinen Chongjie Zhang Nathan Jacobs Yevgeniy Vorobeychik 49 4 0 04 Jun 2024
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model Ling Li Yu Ye Bingchuan Jiang Wei Zeng VLM LRM 31 7 0 03 Jun 2024
G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models Pengyue Jia Yiding Liu Xiaopeng Li Xiangyu Zhao Yuhao Wang Yantong Du Xiao Han Xuetao Wei Shuaiqiang Wang Dawei Yin DiffM 34 6 0 23 May 2024
OpenStreetView-5M: The Many Roads to Global Visual Geolocation Guillaume Astruc Nicolas Dufour Ioannis Siglidis Constantin Aronssohn Nacim Bouia ... Charles Raude Elliot Vincent Lintao Xu Hongyu Zhou Loic Landrieu 27 6 0 29 Apr 2024
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation Zhongliang Zhou Jielu Zhang Zihan Guan Mengxuan Hu Ni Lao Lan Mu Sheng R. Li Gengchen Mai VLM 43 12 0 28 Mar 2024
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization V. Cepeda Gaurav Kumar Nayak Mubarak Shah 13 83 0 27 Sep 2023
Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes Brandon Clark Alec Kerrigan P. Kulkarni V. Cepeda M. Shah 19 21 0 07 Mar 2023
Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization Lukas Haas Silas Alberti Michal Skreta VLM 23 21 0 01 Feb 2023
Aerial View Localization with Reinforcement Learning: Towards Emulating Search-and-Rescue Aleksis Pirinen A. Samuelsson John Backsund Kalle Åström 39 4 0 08 Sep 2022
Visual and Object Geo-localization: A Comprehensive Survey Daniel Wilson Xiaohan Zhang Waqas Sultani S. Wshah 24 15 0 30 Dec 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding Hu Xu Gargi Ghosh Po-Yao (Bernie) Huang Dmytro Okhonko Armen Aghajanyan Florian Metze Luke Zettlemoyer Christoph Feichtenhofer CLIP VLM 259 558 0 28 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text Hassan Akbari Liangzhe Yuan Rui Qian Wei-Hong Chuang Shih-Fu Chang Yin Cui Boqing Gong ViT 248 577 0 22 Apr 2021
ImageNet-21K Pretraining for the Masses T. Ridnik Emanuel Ben-Baruch Asaf Noy Lihi Zelnik-Manor SSeg VLM CLIP 176 686 0 22 Apr 2021
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition Stephen Hausler Sourav Garg Ming Xu Michael Milford Tobias Fischer 53 329 0 02 Mar 2021
CrossTransformers: spatially-aware few-shot transfer Carl Doersch Ankush Gupta Andrew Zisserman ViT 203 330 0 22 Jul 2020
Deep High-Resolution Representation Learning for Visual Recognition Jingdong Wang Ke Sun Tianheng Cheng Borui Jiang Chaorui Deng ... Yadong Mu Mingkui Tan Xinggang Wang Wenyu Liu Bin Xiao 195 3,529 0 20 Aug 2019
Semantic Understanding of Scenes through the ADE20K Dataset Bolei Zhou Hang Zhao Xavier Puig Tete Xiao Sanja Fidler Adela Barriuso Antonio Torralba SSeg 253 1,827 0 18 Aug 2016