2410.11087
Locality Alignment Improves Vision-Language Models
International Conference on Learning Representations (ICLR), 2025
14 October 2024
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
Papers citing
"Locality Alignment Improves Vision-Language Models"
23 / 123 papers shown
Title
TextCaps: a Dataset for Image Captioning with Reading Comprehension
European Conference on Computer Vision (ECCV), 2020
Oleksii Sidorov
Ronghang Hu
Marcus Rohrbach
Amanpreet Singh
307
493
0
24 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
International Conference on Machine Learning (ICML), 2020
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
1.0K
21,811
0
13 Feb 2020
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
1.5K
8,651
0
02 Oct 2019
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2019
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
471
1,576
0
08 Aug 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Computer Vision and Pattern Recognition (CVPR), 2019
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
490
1,336
0
31 May 2019
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
IEEE International Conference on Computer Vision (ICCV), 2019
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
1.4K
5,430
0
13 May 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
505
1,632
0
18 Apr 2019
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
188
158
0
29 Oct 2018
AutoAugment: Learning Augmentation Policies from Data
E. D. Cubuk
Barret Zoph
Dandelion Mané
Vijay Vasudevan
Quoc V. Le
612
1,886
0
24 May 2018
Unsupervised Representation Learning by Predicting Image Rotations
Spyros Gidaris
Praveer Singh
N. Komodakis
OOD
SSL
DRL
731
3,481
0
21 Mar 2018
mixup: Beyond Empirical Risk Minimization
International Conference on Learning Representations (ICLR), 2018
Hongyi Zhang
Moustapha Cissé
Yann N. Dauphin
David Lopez-Paz
NoLa
544
10,936
0
25 Oct 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
904
3,723
0
02 Dec 2016
Semantic Understanding of Scenes through the ADE20K Dataset
International Journal of Computer Vision (IJCV), 2016
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
635
2,123
0
18 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
422
1,483
0
31 Jul 2016
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer
Jonathan Long
Trevor Darrell
VOS
SSeg
1.1K
40,321
0
20 May 2016
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
M. Noroozi
Paolo Favaro
SSL
593
3,138
0
30 Mar 2016
A Diagram Is Worth A Dozen Images
Aniruddha Kembhavi
M. Salvato
Eric Kolve
Minjoon Seo
Hannaneh Hajishirzi
Ali Farhadi
3DV
196
724
0
24 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
868
6,156
0
23 Feb 2016
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
890
6,014
0
03 May 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
721
21,991
0
09 Mar 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
Computer Vision and Pattern Recognition (CVPR), 2015
A. Karpathy
Li Fei-Fei
454
5,830
0
07 Dec 2014
Microsoft COCO: Common Objects in Context
European Conference on Computer Vision (ECCV), 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
8.1K
48,609
0
01 May 2014
Visualizing and Understanding Convolutional Networks
European Conference on Computer Vision (ECCV), 2014
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
922
16,538
0
12 Nov 2013