2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,777 papers shown
Title
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD
VLM
MLLM
171
412
0
17 Jun 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
92
26
0
17 Jun 2022
DGMIL: Distribution Guided Multiple Instance Learning for Whole Slide Image Classification
Linhao Qu
Xiao-Zhuo Luo
Shaolei Liu
Manning Wang
Zhijian Song
VLM
88
57
0
17 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
150
388
0
17 Jun 2022
Masked Autoencoders for Generic Event Boundary Detection CVPR'2022 Kinetics-GEBD Challenge
Ruifei He
Yuanxi Sun
Youzeng Li
Zuwei Huang
Feng Hu
Xu Cheng
Jie Tang
73
3
0
17 Jun 2022
Multi-Contextual Predictions with Vision Transformer for Video Anomaly Detection
Joo-Yeon Lee
Woo-Jeoung Nam
Seong-Whan Lee
ViT
61
14
0
17 Jun 2022
Rectify ViT Shortcut Learning by Visual Saliency
Chong Ma
Lin Zhao
Yuzhong Chen
David Liu
Xi Jiang
Tuo Zhang
Xintao Hu
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
108
20
0
17 Jun 2022
MET: Masked Encoding for Tabular Data
Kushal Majmundar
Sachin Goyal
Praneeth Netrapalli
Prateek Jain
LMTD
59
0
0
17 Jun 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
114
99
0
16 Jun 2022
iBoot: Image-bootstrapped Self-Supervised Video Representation Learning
F. Saleh
Fuwen Tan
Adrian Bulat
Georgios Tzimiropoulos
Brais Martínez
SSL
99
1
0
16 Jun 2022
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
Viraj Prabhu
Sriram Yenamandra
Aaditya K. Singh
Judy Hoffman
64
15
0
16 Jun 2022
Evaluating Self-Supervised Learning for Molecular Graph Embeddings
Hanchen Wang
Jean Kaddour
Shengchao Liu
Jian Tang
Joan Lasenby
Qi Liu
121
23
0
16 Jun 2022
On Privacy and Personalization in Cross-Silo Federated Learning
Ziyu Liu
Shengyuan Hu
Zhiwei Steven Wu
Virginia Smith
FedML
113
56
0
16 Jun 2022
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Jiahao Xie
Wei Li
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
115
74
0
15 Jun 2022
Masked Siamese ConvNets
L. Jing
Jiachen Zhu
Yann LeCun
SSL
114
35
0
15 Jun 2022
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
Shizhe Diao
Wangchunshu Zhou
Xinsong Zhang
Jiawei Wang
MLLM
AI4CE
95
17
0
15 Jun 2022
A Simple Data Mixing Prior for Improving Self-Supervised Learning
Sucheng Ren
Huiyu Wang
Zhengqi Gao
Shengfeng He
Alan Yuille
Yuyin Zhou
Cihang Xie
51
35
0
15 Jun 2022
Rethinking Generalization in Few-Shot Classification
Markus Hiller
Rongkai Ma
Mehrtash Harandi
Tom Drummond
OCL
VLM
112
57
0
15 Jun 2022
Exploring Adversarial Attacks and Defenses in Vision Transformers trained with DINO
Javier Rando
Nasib Naimi
Thomas Baumann
Max Mathys
AAML
55
6
0
14 Jun 2022
LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
Yi-Lin Sung
Jaemin Cho
Joey Tianyi Zhou
VLM
99
246
0
13 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
236
575
0
13 Jun 2022
RPLHR-CT Dataset and Transformer Baseline for Volumetric Super-Resolution from CT Scans
Pengxin Yu
Haoyue Zhang
Han Kang
Wen Tang
C. Arnold
Rongguo Zhang
ViT
OOD
MedIm
56
20
0
13 Jun 2022
GLIPv2: Unifying Localization and Vision-Language Understanding
Haotian Zhang
Pengchuan Zhang
Xiaowei Hu
Yen-Chun Chen
Liunian Harold Li
Xiyang Dai
Lijuan Wang
Lu Yuan
Lei Li
Jianfeng Gao
ObjD
VLM
97
302
0
12 Jun 2022
Bootstrapping Multi-view Representations for Fake News Detection
Qichao Ying
Xiaoxiao Hu
Yangming Zhou
Zhenxing Qian
Dan Zeng
Shiming Ge
76
50
0
12 Jun 2022
APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking
Yuxiang Yang
Junjie Yang
Yufei Xu
Jing Zhang
Long Lan
Dacheng Tao
93
44
0
12 Jun 2022
Does Self-supervised Learning Really Improve Reinforcement Learning from Pixels?
Xiang Li
Jinghuan Shang
Srijan Das
Michael S. Ryoo
SSL
104
33
0
10 Jun 2022
Is Self-Supervised Learning More Robust Than Supervised Learning?
Yuanyi Zhong
Haoran Tang
Jun-Kun Chen
Jian-wei Peng
Yu-Xiong Wang
SSL
OOD
77
25
0
10 Jun 2022
SERE: Exploring Feature Self-relation for Self-supervised Transformer
Zhong-Yu Li
Shanghua Gao
Ming-Ming Cheng
ViT
MDE
101
14
0
10 Jun 2022
Saccade Mechanisms for Image Classification, Object Detection and Tracking
Saurabh Farkya
Z. Daniels
Aswin Raghavan
David C. Zhang
M. Piacentino
69
3
0
10 Jun 2022
Positional Label for Self-Supervised Vision Transformer
Zhemin Zhang
Xun Gong
ViT
MDE
59
6
0
10 Jun 2022
Seeing the forest and the tree: Building representations of both individual and collective dynamics with transformers
Ran Liu
Mehdi Azabou
M. Dabagia
Jingyun Xiao
Eva L. Dyer
AI4CE
95
19
0
10 Jun 2022
Learning to Estimate Shapley Values with Vision Transformers
Ian Covert
Chanwoo Kim
Su-In Lee
FAtt
63
39
0
10 Jun 2022
Masked Autoencoders are Robust Data Augmentors
Haohang Xu
Shuangrui Ding
Xiaopeng Zhang
H. Xiong
139
28
0
10 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
106
22
0
09 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
100
57
0
09 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
N. Sebe
ViT
MDE
123
2
0
09 Jun 2022
Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
82
29
0
09 Jun 2022
CASS: Cross Architectural Self-Supervision for Medical Image Analysis
Pranav Singh
E. Sizikova
Jacopo Cirrone
OOD
173
8
0
08 Jun 2022
Robust Semantic Communications with Masked VQ-VAE Enabled Codebook
Qiyu Hu
Guangyi Zhang
Zhijin Qin
Yunlong Cai
Guanding Yu
Geoffrey Ye Li
AAML
96
150
0
08 Jun 2022
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
Jia Pan
Pan Zhou
Shuicheng Yan
SSL
89
17
0
08 Jun 2022
Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Yang Shu
Zhangjie Cao
Ziyang Zhang
Jianmin Wang
Mingsheng Long
70
4
0
08 Jun 2022
Delving into the Pre-training Paradigm of Monocular 3D Object Detection
Zhuoling Li
Chuanrui Zhang
En Yu
Haoqian Wang
37
1
0
08 Jun 2022
Can CNNs Be More Robust Than Transformers?
Zeyu Wang
Yutong Bai
Yuyin Zhou
Cihang Xie
UQCV
OOD
115
46
0
07 Jun 2022
Siamese Encoder-based Spatial-Temporal Mixer for Growth Trend Prediction of Lung Nodules on CT Scans
Jiansheng Fang
Jingwen Wang
Anwei Li
Yuguang Yan
Yonghe Hou
Chao Song
Hongbo Liu
Jiang Liu
31
7
0
07 Jun 2022
Masked Unsupervised Self-training for Label-free Image Classification
Junnan Li
Silvio Savarese
Steven C. H. Hoi
VLM
SSL
45
13
0
07 Jun 2022
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
Richard J. Chen
Chengkuan Chen
Yicong Li
Tiffany Y. Chen
A. Trister
Rahul G. Krishnan
Faisal Mahmood
ViT
MedIm
127
432
0
06 Jun 2022
Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning
Yuandong Tian
MLT
125
14
0
02 Jun 2022
Siamese Image Modeling for Self-Supervised Vision Representation Learning
Chenxin Tao
Xizhou Zhu
Weijie Su
Gao Huang
Bin Li
Jie Zhou
Yu Qiao
Xiaogang Wang
Jifeng Dai
SSL
109
96
0
02 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
175
46
0
02 Jun 2022
VL-BEiT: Generative Vision-Language Pretraining
Hangbo Bao
Wenhui Wang
Li Dong
Furu Wei
VLM
84
45
0
02 Jun 2022