2111.06377
Cited By
v1
v2
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,777 papers shown
Title
3D UX-Net: A Large Kernel Volumetric ConvNet Modernizing Hierarchical Transformer for Medical Image Segmentation
Ho Hin Lee
Shunxing Bao
Yuankai Huo
Bennett A. Landman
OOD
MedIm
159
143
0
29 Sep 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
75
5
0
29 Sep 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
114
73
0
29 Sep 2022
Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations
Grégoire Mialon
Randall Balestriero
Yann LeCun
98
10
0
29 Sep 2022
Bridging the Gap to Real-World Object-Centric Learning
Maximilian Seitzer
Max Horn
Andrii Zadaianchuk
Dominik Zietlow
Tianjun Xiao
...
Tong He
Zheng Zhang
Bernhard Schölkopf
Thomas Brox
Francesco Locatello
OCL
139
153
0
29 Sep 2022
Efficient Medical Image Assessment via Self-supervised Learning
Chun-Yin Huang
Qi Lei
Xiaoxiao Li
54
2
0
28 Sep 2022
Downstream Datasets Make Surprisingly Good Pretraining Corpora
Kundan Krishna
Saurabh Garg
Jeffrey P. Bigham
Zachary Chase Lipton
108
33
0
28 Sep 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Bjorn W. Schuller
BDL
SSL
83
8
0
28 Sep 2022
Transfer Learning with Pretrained Remote Sensing Transformers
A. Fuller
K. Millard
J.R. Green
70
11
0
28 Sep 2022
TVLT: Textless Vision-Language Transformer
Zineng Tang
Jaemin Cho
Yixin Nie
Joey Tianyi Zhou
VLM
137
31
0
28 Sep 2022
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks
Zhiyang Chen
Yousong Zhu
Zhaowen Li
Fan Yang
Wei Li
...
Chaoyang Zhao
Liwei Wu
Rui Zhao
Jinqiao Wang
Ming Tang
VLM
VOS
124
16
0
28 Sep 2022
Denoising of 3D MR images using a voxel-wise hybrid residual MLP-CNN model to improve small lesion diagnostic confidence
Haibo Yang
Shengjie Zhang
Xiaoyang Han
Botao Zhao
Yan-hua Ren
Yaru Sheng
Xiao-Yong Zhang
MedIm
90
8
0
28 Sep 2022
CourtNet for Infrared Small-Target Detection
Jingchao Peng
Haitao Zhao
Kaijie Zhao
Zhengwei Hu
Zhongze Wang
36
11
0
28 Sep 2022
Reconstruction-guided attention improves the robustness and shape processing of neural networks
Seoyoung Ahn
Hossein Adeli
G. Zelinsky
DiffM
AAML
65
1
0
27 Sep 2022
Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs
Đorđe Miladinovic
Kumar Shridhar
Kushal Kumar Jain
Max B. Paulus
J. M. Buhmann
Mrinmaya Sachan
Carl Allen
DRL
99
5
0
26 Sep 2022
Self-supervised similarity models based on well-logging data
S. Egorov
Narek Gevorgyan
Alexey Zaytsev
SSL
64
5
0
26 Sep 2022
Collaboration of Pre-trained Models Makes Better Few-shot Learner
Renrui Zhang
Bohao Li
Wei Zhang
Hao Dong
Hongsheng Li
Peng Gao
Yu Qiao
VLM
114
7
0
25 Sep 2022
Multimodal Channel-Mixing: Channel and Spatial Masked AutoEncoder on Facial Action Unit Detection
Xiang Zhang
Huiyuan Yang
Taoyue Wang
Xiaotian Li
L. Yin
111
7
0
25 Sep 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
185
365
0
25 Sep 2022
Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection
Neelu Madan
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
Kamal Nasrollahi
Fahad Shahbaz Khan
T. Moeslund
M. Shah
ViT
MedIm
341
70
0
25 Sep 2022
Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement Learning
Manuel Goulão
Arlindo L. Oliveira
ViT
109
6
0
22 Sep 2022
Lightweight Transformers for Human Activity Recognition on Mobile Devices
Sannara Ek
François Portet
P. Lalanda
83
32
0
22 Sep 2022
Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions
Dong Zhang
Yi Lin
Hao Chen
Zhuotao Tian
Xin Yang
Jinhui Tang
Kwang-Ting Cheng
VLM
103
12
0
21 Sep 2022
PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification
Wenhao Tang
Shengyue Huang
Xiaoxian Zhang
Luwen Huangfu
ViT
69
3
0
21 Sep 2022
MPC with Sensor-Based Online Cost Adaptation
Avadesh Meduri
Huaijiang Zhu
Armand Jordana
Ludovic Righetti
84
4
0
20 Sep 2022
NIERT: Accurate Numerical Interpolation through Unifying Scattered Data Representations using Transformer Encoder
Shi-qi Ding
Boyang Xia
Milong Ren
Dongbo Bu
60
3
0
19 Sep 2022
Attentive Symmetric Autoencoder for Brain MRI Segmentation
Junjia Huang
Haofeng Li
Guanbin Li
Xiang Wan
MedIm
55
17
0
19 Sep 2022
S³R: Self-supervised Spectral Regression for Hyperspectral Histopathology Image Classification
Xingran Xie
Yan Wang
Qingli Li
96
5
0
19 Sep 2022
MetaMask: Revisiting Dimensional Confounder for Self-Supervised Learning
Jiangmeng Li
Jingyao Wang
Yanan Zhang
Wenyi Mo
Changwen Zheng
Fuchun Sun
Hui Xiong
SSL
112
14
0
16 Sep 2022
DBT-DMAE: An Effective Multivariate Time Series Pre-Train Model under Missing Data
Kai Zhang
Qinmin Yang
Chong Li
AI4TS
21
0
0
16 Sep 2022
Enhance the Visual Representation via Discrete Adversarial Training
Xiaofeng Mao
YueFeng Chen
Ranjie Duan
Yao Zhu
Gege Qi
Shaokai Ye
Xiaodan Li
Rong Zhang
Hui Xue
113
33
0
16 Sep 2022
Test-Time Training with Masked Autoencoders
Yossi Gandelsman
Yu Sun
Xinlei Chen
Alexei A. Efros
OOD
106
178
0
15 Sep 2022
On-Device Domain Generalization
Kaiyang Zhou
Yuanhan Zhang
Yuhang Zang
Jingkang Yang
Chen Change Loy
Ziwei Liu
OOD
125
7
0
15 Sep 2022
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Judy Hoffman
150
80
0
15 Sep 2022
BadRes: Reveal the Backdoors through Residual Connection
Min He
Tianyu Chen
Haoyi Zhou
Shanghang Zhang
Jianxin Li
59
1
0
15 Sep 2022
Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge
Zhihong Chen
Guanbin Li
Xiang Wan
178
73
0
15 Sep 2022
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
148
118
0
15 Sep 2022
Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?
Yi Wang
Zhiwen Fan
Tianlong Chen
Hehe Fan
Zhangyang Wang
ViT
107
10
0
15 Sep 2022
Pre-training for Information Retrieval: Are Hyperlinks Fully Explored?
Jiawen Wu
Xinyu Zhang
Yutao Zhu
Zheng Liu
Zikai Guo
Zhaoye Fei
Ruofei Lai
Yongkang Wu
Bo Zhao
Zhicheng Dou
82
5
0
14 Sep 2022
Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation
Maksym Yatsura
K. Sakmann
N. G. Hua
Matthias Hein
J. H. Metzen
AAML
107
18
0
13 Sep 2022
Graph Neural Networks for Molecules
Yuyang Wang
Zijie Li
A. Farimani
GNN
AI4CE
154
28
0
12 Sep 2022
Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe
Hongyang Li
Chonghao Sima
Jifeng Dai
Wenhai Wang
Lewei Lu
...
Xiaosong Jia
Siqian Liu
Jianping Shi
Dahua Lin
Yu Qiao
176
150
0
12 Sep 2022
Exploring Target Representations for Masked Autoencoders
Xingbin Liu
Jinghao Zhou
Tao Kong
Xianming Lin
Rongrong Ji
197
52
0
08 Sep 2022
Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations
Vadim Tschernezki
Iro Laina
Diane Larlus
Andrea Vedaldi
255
194
0
07 Sep 2022
MimCo: Masked Image Modeling Pre-training with Contrastive Teacher
Qiang-feng Zhou
Chaohui Yu
Haowen Luo
Zhibin Wang
Hao Li
VLM
132
21
0
07 Sep 2022
Prior Knowledge-Guided Attention in Self-Supervised Vision Transformers
Kevin Miao
Akash Gokul
Raghav Singh
Suzanne Petryk
Joseph E. Gonzalez
Kurt Keutzer
Trevor Darrell
Colorado Reed
ViT
MedIm
65
6
0
07 Sep 2022
Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision
Lei Zhang
H. Shum
VLM
SSL
65
2
0
06 Sep 2022
ViTKD: Practical Guidelines for ViT feature knowledge distillation
Zhendong Yang
Zhe Li
Ailing Zeng
Zexian Li
Chun Yuan
Yu Li
145
42
0
06 Sep 2022
Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation
Yen-Tung Yeh
Bo-Yu Chen
Yi-Hsuan Yang
75
6
0
05 Sep 2022
Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomography
Thomas Z. Li
Kaiwen Xu
Riqiang Gao
Yucheng Tang
Thomas A. Lasko
Fabien Maldonado
K. Sandler
Bennett A. Landman
ViT
MedIm
41
15
0
04 Sep 2022