Locality Alignment Improves Vision-Language Models

International Conference on Learning Representations (ICLR), 2024
14 October 2024
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
    VLM

Papers citing "Locality Alignment Improves Vision-Language Models"

23 / 123 papers shown
TextCaps: a Dataset for Image Captioning with Reading Comprehension
European Conference on Computer Vision (ECCV), 2020
Oleksii Sidorov
Ronghang Hu
Marcus Rohrbach
Amanpreet Singh
307
493
0
24 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
International Conference on Machine Learning (ICML), 2020
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
1.0K
21,811
0
13 Feb 2020
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
1.5K
8,651
0
02 Oct 2019
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2019
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
471
1,576
0
08 Aug 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Computer Vision and Pattern Recognition (CVPR), 2019
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
490
1,336
0
31 May 2019
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
IEEE International Conference on Computer Vision (ICCV), 2019
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
1.4K
5,430
0
13 May 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
505
1,632
0
18 Apr 2019
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
188
158
0
29 Oct 2018
AutoAugment: Learning Augmentation Policies from Data
E. D. Cubuk
Barret Zoph
Dandelion Mané
Vijay Vasudevan
Quoc V. Le
612
1,886
0
24 May 2018
Unsupervised Representation Learning by Predicting Image Rotations
Spyros Gidaris
Praveer Singh
N. Komodakis
OOD
SSL
DRL
731
3,481
0
21 Mar 2018
mixup: Beyond Empirical Risk Minimization
International Conference on Learning Representations (ICLR), 2017
Hongyi Zhang
Moustapha Cissé
Yann N. Dauphin
David Lopez-Paz
NoLa
544
10,936
0
25 Oct 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
904
3,723
0
02 Dec 2016
Semantic Understanding of Scenes through the ADE20K Dataset
International Journal of Computer Vision (IJCV), 2016
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
635
2,123
0
18 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
422
1,483
0
31 Jul 2016
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer
Jonathan Long
Trevor Darrell
VOS
SSeg
1.1K
40,321
0
20 May 2016
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
M. Noroozi
Paolo Favaro
SSL
593
3,138
0
30 Mar 2016
A Diagram Is Worth A Dozen Images
Aniruddha Kembhavi
M. Salvato
Eric Kolve
Minjoon Seo
Hannaneh Hajishirzi
Ali Farhadi
3DV
196
724
0
24 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
868
6,156
0
23 Feb 2016
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
890
6,014
0
03 May 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
721
21,991
0
09 Mar 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
Computer Vision and Pattern Recognition (CVPR), 2014
A. Karpathy
Li Fei-Fei
454
5,830
0
07 Dec 2014
Microsoft COCO: Common Objects in Context
European Conference on Computer Vision (ECCV), 2014
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
8.1K
48,609
0
01 May 2014
Visualizing and Understanding Convolutional Networks
European Conference on Computer Vision (ECCV), 2013
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
922
16,538
0
12 Nov 2013