2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,611 papers shown
Title
Revisiting a kNN-based Image Classification System with High-capacity Storage
K. Nakata
Youyang Ng
Daisuke Miyashita
A. Maki
Yu Lin
J. Deguchi
19
26
0
03 Apr 2022
Improving Vision Transformers by Revisiting High-frequency Components
Jiawang Bai
Liuliang Yuan
Shutao Xia
Shuicheng Yan
Zhifeng Li
W. Liu
ViT
8
90
0
03 Apr 2022
POS-BERT: Point Cloud One-Stage BERT Pre-Training
Kexue Fu
Peng Gao
Shaolei Liu
Renrui Zhang
Yu Qiao
Manning Wang
3DPC
22
18
0
03 Apr 2022
UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation
Ali Hatamizadeh
Ziyue Xu
Dong Yang
Wenqi Li
H. Roth
Daguang Xu
ViT
MedIm
21
29
0
01 Apr 2022
Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification
Yang Luo
Zhineng Chen
Shengtian Zhou
Xieping Gao
20
1
0
31 Mar 2022
MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
Alan Baade
Puyuan Peng
David F. Harwath
23
95
0
30 Mar 2022
Exploring Plain Vision Transformer Backbones for Object Detection
Yanghao Li
Hanzi Mao
Ross B. Girshick
Kaiming He
ViT
25
774
0
30 Mar 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
29
38
0
29 Mar 2022
In-N-Out Generative Learning for Dense Unsupervised Video Segmentation
Xiaomiao Pan
Peike Li
Zongxin Yang
Huiling Zhou
Chang Zhou
Hongxia Yang
Jingren Zhou
Yi Yang
VOS
19
11
0
29 Mar 2022
Large-scale Bilingual Language-Image Contrastive Learning
ByungSoo Ko
Geonmo Gu
VLM
17
14
0
28 Mar 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
29
60
0
27 Mar 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
23
19
0
27 Mar 2022
3D-OAE: Occlusion Auto-Encoders for Self-Supervised Learning on Point Clouds
Junsheng Zhou
Xin Wen
Baorui Ma
Yu-Shen Liu
Yue Gao
Yi Fang
Zhizhong Han
3DPC
28
17
0
26 Mar 2022
On the Viability of Monocular Depth Pre-training for Semantic Segmentation
Dong Lao
Fengyu Yang
Daniel Wang
Hyoungseob Park
Samuel Lu
Alex Wong
Stefano Soatto
MDE
18
0
0
26 Mar 2022
Reinforcement Learning with Action-Free Pre-Training from Videos
Younggyo Seo
Kimin Lee
Stephen James
Pieter Abbeel
SSL
OnRL
16
115
0
25 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
111
1,120
0
23 Mar 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge J. Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
40
1,514
0
23 Mar 2022
Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
Botao Ye
Hong Chang
Bingpeng Ma
Shiguang Shan
Xilin Chen
ViT
13
275
0
22 Mar 2022
Unsupervised Anomaly Detection in Medical Images with a Memory-augmented Multi-level Cross-attentional Masked Autoencoder
Yu Tian
Guansong Pang
Yuyuan Liu
Chong Wang
Yuanhong Chen
Fengbei Liu
Rajvinder Singh
Johan W. Verjans
Mengyu Wang
G. Carneiro
ViT
17
24
0
22 Mar 2022
Root-aligned SMILES: A Tight Representation for Chemical Reaction Prediction
Zipeng Zhong
Jie Song
Zunlei Feng
Tiantao Liu
Lingxiang Jia
Shaolun Yao
Min-Ying Wu
Tingjun Hou
Mingli Song
21
52
0
22 Mar 2022
Representation Uncertainty in Self-Supervised Learning as Variational Inference
Hiroki Nakamura
Masashi Okada
T. Taniguchi
17
18
0
22 Mar 2022
Test-time Adaptation with Slot-Centric Models
Mihir Prabhudesai
Anirudh Goyal
S. Paul
Sjoerd van Steenkiste
Mehdi S. M. Sajjadi
Gaurav Aggarwal
Thomas Kipf
Deepak Pathak
Katerina Fragkiadaki
TTA
16
8
0
21 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
21
163
0
21 Mar 2022
MixFormer: End-to-End Tracking with Iterative Mixed Attention
Yutao Cui
Jiang Cheng
Limin Wang
Gangshan Wu
VOT
23
452
0
21 Mar 2022
Upsampling Autoencoder for Self-Supervised Point Cloud Learning
Cheng Zhang
Jian Shi
X. Deng
Zizhao Wu
3DPC
27
8
0
21 Mar 2022
simCrossTrans: A Simple Cross-Modality Transfer Learning for Object Detection with ConvNets or Vision Transformers
Xiaoke Shen
I. Stamos
ViT
18
5
0
20 Mar 2022
Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion
Zhiqiang Yan
Xiang Li
Kun Wang
Zhenyu Zhang
Jun Yu Li
Jian Yang
MDE
29
32
0
18 Mar 2022
Three things everyone should know about Vision Transformers
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
16
119
0
18 Mar 2022
Emerging Artificial Intelligence Applications in Spatial Transcriptomics Analysis
Yijun Li
Stefan Stanojevic
L. Garmire
15
24
0
18 Mar 2022
GATE: Graph CCA for Temporal SElf-supervised Learning for Label-efficient fMRI Analysis
Liang Peng
Nan Wang
Jie Xu
Xiao-lan Zhu
Xiaoxiao Li
22
33
0
17 Mar 2022
Object discovery and representation networks
Olivier J. Hénaff
Skanda Koppula
Evan Shelhamer
Daniel Zoran
Andrew Jaegle
Andrew Zisserman
João Carreira
Relja Arandjelović
33
87
0
16 Mar 2022
Weak Augmentation Guided Relational Self-Supervised Learning
Mingkai Zheng
Shan You
Fei Wang
Chao Qian
Changshui Zhang
Xiaogang Wang
Chang Xu
24
4
0
16 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
23
87
0
16 Mar 2022
Data Efficient 3D Learner via Knowledge Transferred from 2D Model
Ping Yu
Cheng Sun
Min Sun
3DPC
20
11
0
16 Mar 2022
P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation
Wenkang Shan
Zhenhua Liu
Xinfeng Zhang
Shanshe Wang
Siwei Ma
Wen Gao
3DH
21
121
0
15 Mar 2022
SuperAnimal pretrained pose estimation models for behavioral analysis
Shaokai Ye
Anastasiia Filippova
Jessy Lauer
Steffen Schneider
Maxime Vidal
Tian Qiu
Alexander Mathis
Mackenzie W. Mathis
21
26
0
14 Mar 2022
Rethinking Minimal Sufficient Representation in Contrastive Learning
Haoqing Wang
Xun Guo
Zhiwei Deng
Yan Lu
SSL
6
73
0
14 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
W. Liu
Yonghong Tian
Liuliang Yuan
3DPC
ViT
31
451
0
13 Mar 2022
Masked Visual Pre-training for Motor Control
Tete Xiao
Ilija Radosavovic
Trevor Darrell
Jitendra Malik
SSL
21
241
0
11 Mar 2022
Active Token Mixer
Guoqiang Wei
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
10
15
0
11 Mar 2022
Visualizing and Understanding Patch Interactions in Vision Transformer
Jie Ma
Yalong Bai
Bineng Zhong
Wei Zhang
Ting Yao
Tao Mei
ViT
8
32
0
11 Mar 2022
Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation
Lei Zhou
Huidong Liu
Joseph Bae
Junjun He
Dimitris Samaras
Prateek Prasanna
MedIm
ViT
9
64
0
10 Mar 2022
Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking
Boyu Chen
Peixia Li
Lei Bai
Leixian Qiao
Qiuhong Shen
Bo-wen Li
Weihao Gan
Wei Wu
Wanli Ouyang
ViT
VOT
20
182
0
10 Mar 2022
MVP: Multimodality-guided Visual Pre-training
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
26
105
0
10 Mar 2022
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement
Mohamed Ali Souibgui
Sanket Biswas
Andrés Mafla
Ali Furkan Biten
Alicia Fornés
Yousri Kessentini
Josep Lladós
Lluís Gómez
Dimosthenis Karatzas
13
23
0
09 Mar 2022
Multiscale Convolutional Transformer with Center Mask Pretraining for Hyperspectral Image Classification
Sen Jia
Yifan Wang
ViT
33
13
0
09 Mar 2022
Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification
Zhiyuan Cai
Li Lin
Huaqing He
Xiaoying Tang
ViT
MedIm
13
28
0
09 Mar 2022
Domain Generalization using Pretrained Models without Fine-tuning
Ziyue Li
Kan Ren
Xinyang Jiang
Bo-wen Li
Haipeng Zhang
Dongsheng Li
VLM
20
37
0
09 Mar 2022
Gait Recognition with Mask-based Regularization
Chuanfu Shen
Beibei Lin
Shunli Zhang
George Q. Huang
Shiqi Yu
Xin-cen Yu
CVBM
33
17
0
08 Mar 2022
Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers
Miguel A. Saavedra-Ruiz
Sacha Morin
Liam Paull
MDE
ViT
25
3
0
07 Mar 2022