Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.19596
Cited By
LocCa: Visual Pretraining with Location-aware Captioners
28 March 2024
Bo Wan
Michael Tschannen
Yongqin Xian
Filip Pavetić
Ibrahim Alabdulmohsin
Xiao Wang
André Susano Pinto
Andreas Steiner
Lucas Beyer
Xiao-Qi Zhai
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"LocCa: Visual Pretraining with Location-aware Captioners"
13 / 63 papers shown
Title
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
106
2,262
0
01 Mar 2017
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
Esteban Real
Jonathon Shlens
S. Mazzocchi
Xin Pan
Vincent Vanhoucke
VOS
ObjD
95
534
0
02 Feb 2017
Feature Pyramid Networks for Object Detection
Nayeon Lee
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
480
22,134
0
09 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
345
3,270
0
02 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
139
446
0
01 Dec 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
404
1,889
0
18 Aug 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
883
27,412
0
02 Dec 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
129
1,170
0
24 Nov 2015
Learning Visual Features from Large Weakly Supervised Data
Armand Joulin
Laurens van der Maaten
Allan Jabri
Nicolas Vasilache
SSL
107
408
0
06 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
520
62,360
0
04 Jun 2015
Fast R-CNN
Ross B. Girshick
ObjD
309
25,059
0
30 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
215
2,489
0
01 Apr 2015
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
295
4,508
0
20 Nov 2014
Previous
1
2