2106.04560
Cited By
Scaling Vision Transformers
8 June 2021
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
Papers citing "Scaling Vision Transformers"
50 / 751 papers shown
Title
The Curious Case of Benign Memorization
Sotiris Anagnostidis
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
AAML
49
8
0
25 Oct 2022
The Robustness Limits of SoTA Vision Models to Natural Variation
Mark Ibrahim
Q. Garrido
Ari S. Morcos
Diane Bouchacourt
VLM
43
16
0
24 Oct 2022
Precision Machine Learning
Eric J. Michaud
Ziming Liu
Max Tegmark
24
34
0
24 Oct 2022
Window-Based Distribution Shift Detection for Deep Neural Networks
Guy Bar-Shalom
Yonatan Geifman
Ran El-Yaniv
20
3
0
19 Oct 2022
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
51
422
0
17 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
57
3,276
0
16 Oct 2022
Active Learning from the Web
Ryoma Sato
27
0
0
15 Oct 2022
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer
Jian Wang
Chen-xi Gou
Qiman Wu
Haocheng Feng
Junyu Han
Errui Ding
Jingdong Wang
ViT
36
96
0
13 Oct 2022
Probabilistic Integration of Object Level Annotations in Chest X-ray Classification
Tom van Sonsbeek
Xiantong Zhen
Dwarikanath Mahapatra
M. Worring
31
12
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
22
39
0
12 Oct 2022
On Divergence Measures for Bayesian Pseudocoresets
Balhae Kim
J. Choi
Seanie Lee
Yoonho Lee
Jung-Woo Ha
Juho Lee
DD
19
11
0
12 Oct 2022
Gastrointestinal Disorder Detection with a Transformer Based Approach
A. Hosain
Mynul Islam
Md Humaion Kabir Mehedi
Irteza Enan Kabir
Zarin Tasnim Khan
ViT
MedIm
22
22
0
06 Oct 2022
Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic
Tete Xiao
Stephen James
Pieter Abbeel
Jitendra Malik
Trevor Darrell
SSL
156
241
0
06 Oct 2022
Generalization Properties of Retrieval-based Models
Soumya Basu
A. S. Rawat
Manzil Zaheer
31
6
0
06 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
56
371
0
05 Oct 2022
Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search
Yannis Cattan
Christopher A. Choquette-Choo
Nicolas Papernot
Abhradeep Thakurta
26
20
0
05 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
39
59
0
04 Oct 2022
Optimizing Data Collection for Machine Learning
Rafid Mahmood
James Lucas
J. Álvarez
Sanja Fidler
M. Law
93
26
0
03 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
35
25
0
03 Oct 2022
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Bruce X. B. Yu
Jianlong Chang
Lin Liu
Qi Tian
Changan Chen
VPVLM
VLM
68
34
0
03 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
32
26
0
29 Sep 2022
UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation
Xin Yu
Qi Yang
Yinchi Zhou
L. Cai
Riqiang Gao
...
R. Abramson
Zizhao Zhang
Yuankai Huo
Bennett A. Landman
Yucheng Tang
ViT
MedIm
42
0
0
28 Sep 2022
Transfer Learning with Pretrained Remote Sensing Transformers
A. Fuller
K. Millard
J.R. Green
33
11
0
28 Sep 2022
Attacking Compressed Vision Transformers
Swapnil Parekh
Devansh Shah
Pratyush Shukla
AAML
24
1
0
28 Sep 2022
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug
Reinhard Heckel
62
12
0
27 Sep 2022
Greybox XAI: a Neural-Symbolic learning framework to produce interpretable predictions for image classification
Adrien Bennetot
Gianni Franchi
Javier Del Ser
Raja Chatila
Natalia Díaz Rodríguez
AAML
32
28
0
26 Sep 2022
Multi-dataset Training of Transformers for Robust Action Recognition
Junwei Liang
Enwei Zhang
Jun Zhang
Chunhua Shen
ViT
42
11
0
26 Sep 2022
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Judy Hoffman
99
77
0
15 Sep 2022
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
37
688
0
14 Sep 2022
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
159
102
0
13 Sep 2022
Socially Enhanced Situation Awareness from Microblogs using Artificial Intelligence: A Survey
Rabindra Lamsal
Aaron Harwood
M. Read
40
20
0
13 Sep 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
140
29
0
12 Sep 2022
Enabling Connectivity for Automated Mobility: A Novel MQTT-based Interface Evaluated in a 5G Case Study on Edge-Cloud Lidar Object Detection
Lennart Reiher
Bastian Lampe
Timo Woopen
Raphael van Kempen
Till Beemelmanns
L. Eckstein
16
8
0
08 Sep 2022
Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision
Lei Zhang
H. Shum
VLM
SSL
22
2
0
06 Sep 2022
IMG2IMU: Translating Knowledge from Large-Scale Images to IMU Sensing Applications
Hyungjun Yoon
Hyeong-Tae Cha
Hoang C. Nguyen
Taesik Gong
Sungyeop Lee
VLM
SSL
32
0
0
02 Sep 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment
Mustafa Shukor
Guillaume Couairon
Matthieu Cord
VLM
CLIP
24
27
0
29 Aug 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
54
629
0
22 Aug 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
39
28
0
17 Aug 2022
Boosting Distributed Training Performance of the Unpadded BERT Model
Jinle Zeng
Min Li
Zhihua Wu
Jiaqi Liu
Yuang Liu
Dianhai Yu
Yanjun Ma
17
10
0
17 Aug 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
29
306
0
12 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
36
242
0
08 Aug 2022
Frozen CLIP Models are Efficient Video Learners
Ziyi Lin
Shijie Geng
Renrui Zhang
Peng Gao
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Hongsheng Li
CLIP
VLM
16
200
0
06 Aug 2022
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
Ziyi Wang
Xumin Yu
Yongming Rao
Jie Zhou
Jiwen Lu
VPVLM
VLM
24
75
0
04 Aug 2022
Leveraging the HW/SW Optimizations and Ecosystems that Drive the AI Revolution
H. Carvalho
P. Zaykov
Asim Ukaye
22
1
0
04 Aug 2022
Learning Prior Feature and Attention Enhanced Image Inpainting
Chenjie Cao
Qiaole Dong
Yanwei Fu
DiffM
33
25
0
03 Aug 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
Qiang Chen
Xiaokang Chen
Jian Wang
Shan Zhang
Kun Yao
Haocheng Feng
Junyu Han
Errui Ding
Gang Zeng
Jingdong Wang
ViT
49
120
0
26 Jul 2022
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu
Jinnian Zhang
Houwen Peng
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
21
246
0
21 Jul 2022
Pretraining a Neural Network before Knowing Its Architecture
Boris Knyazev
AI4CE
27
1
0
20 Jul 2022