Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.17863
Cited By
Vision Transformers with Natural Language Semantics
27 February 2024
Young-Kyung Kim
Matías Di Martino
Guillermo Sapiro
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vision Transformers with Natural Language Semantics"
17 / 17 papers shown
Title
FFCV: Accelerating Training by Removing Data Bottlenecks
Guillaume Leclerc
Andrew Ilyas
Logan Engstrom
Sung Min Park
Hadi Salman
Aleksander Madry
41
68
0
21 Jun 2023
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
166
710
0
14 Nov 2022
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li
Hao Zhang
Hu-Sheng Xu
Siyi Liu
Lei Zhang
L. Ni
H. Shum
ISeg
113
382
0
06 Jun 2022
AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin
Arash Vahdat
J. Álvarez
Arun Mallya
Jan Kautz
Pavlo Molchanov
ViT
79
336
0
14 Dec 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
231
2,812
0
15 Jun 2021
This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks
Adrian Hoffmann
Claudio Fanconi
Rahul Rade
Jonas Köhler
43
63
0
05 May 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
415
21,347
0
25 Mar 2021
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Dan Hendrycks
Steven Basart
Norman Mu
Saurav Kadavath
Frank Wang
...
Samyak Parajuli
Mike Guo
D. Song
Jacob Steinhardt
Justin Gilmer
OOD
310
1,727
0
29 Jun 2020
RandAugment: Practical automated data augmentation with a reduced search space
E. D. Cubuk
Barret Zoph
Jonathon Shlens
Quoc V. Le
MQ
208
3,480
0
30 Sep 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
227
2,474
0
20 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
130
1,948
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
217
3,667
0
06 Aug 2019
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
D. Song
OODD
193
1,465
0
16 Jul 2019
Learning Robust Global Representations by Penalizing Local Predictive Power
Haohan Wang
Songwei Ge
Eric Xing
Zachary Chase Lipton
OOD
109
957
0
29 May 2019
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
264
19,929
0
07 Oct 2016
Describing Textures in the Wild
Mircea Cimpoi
Subhransu Maji
Iasonas Kokkinos
S. Mohamed
Andrea Vedaldi
3DV
106
2,661
0
14 Nov 2013
Fine-Grained Visual Classification of Aircraft
Subhransu Maji
Esa Rahtu
Arno Solin
Matthew Blaschko
Andrea Vedaldi
109
2,252
0
21 Jun 2013
1