Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners
Hao Quan
Xingyu Li
Weixing Chen
Qun Bai
Mingchen Zou
Ruijie Yang
Tingting Zheng
R. Qi
Xin Gao
Xiaoyu Cui
MedIm
110
21
0
18 May 2022
Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging
Rui Yan
Liangqiong Qu
Qingyue Wei
Shih-Cheng Huang
Liyue Shen
D. Rubin
Lei Xing
Yuyin Zhou
FedML
153
100
0
17 May 2022
Text Detection & Recognition in the Wild for Robot Localization
Z. Raisi
John S. Zelek
68
0
0
17 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
180
571
0
17 May 2022
Transformers in 3D Point Clouds: A Survey
Dening Lu
Qian Xie
Mingqiang Wei
Kyle Gao
Linlin Xu
Jonathan Li
3DPC, ViT
139
53
0
16 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Zhepei Wang
Cem Subakan
Xilin Jiang
Junkai Wu
Efthymios Tzinis
Mirco Ravanelli
Paris Smaragdis
CLL, SSL
123
19
0
15 May 2022
Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel
Ziyang Jiang
Tongshu Zheng
Yiling Liu
David Carlson
73
4
0
15 May 2022
ETAD: Training Action Detection End to End on a Laptop
Shuming Liu
Mengmeng Xu
Chen Zhao
Xu Zhao
Guohao Li
78
7
0
14 May 2022
A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities
Yisheng Song
Ting-Yuan Wang
S. Mondal
J. P. Sahoo
SLR
128
387
0
13 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
116
35
0
12 May 2022
One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code
Yong Dai
Duyu Tang
Liangxin Liu
Minghuan Tan
Cong Zhou
Jingquan Wang
Zhangyin Feng
Fan Zhang
Xueyu Hu
Shuming Shi
VLM, MoE
83
26
0
12 May 2022
CV4Code: Sourcecode Understanding via Visual Code Representations
Ruibo Shi
Lili Tao
Rohan Saphal
Fran Silavong
Sean J. Moran
42
0
0
11 May 2022
AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation
Xu Cao
Xiaoye Li
Liya Ma
Yi Huang
X. Feng
Zening Chen
H. Zeng
Jianguo Cao
ViT
63
21
0
11 May 2022
Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder
S. Ly
Bai Lin
Hung Q. Vo
D. Maric
B. Roysam
H. V. Nguyen
62
0
0
10 May 2022
Reconstruction Enhanced Multi-View Contrastive Learning for Anomaly Detection on Attributed Networks
Jiaqiang Zhang
Senzhang Wang
Songcan Chen
67
53
0
10 May 2022
Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
Haiyang Yang
Meilin Chen
Yizhou Wang
Shixiang Tang
Feng Zhu
Lei Bai
Rui Zhao
Wanli Ouyang
73
19
0
10 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
174
646
0
09 May 2022
Anatomy-aware Self-supervised Learning for Anomaly Detection in Chest Radiographs
Junya Sato
Yuki Suzuki
T. Wataya
Daiki Nishigaki
Kosuke Kita
Kazuki Yamagata
Noriyuki Tomiyama
Shoji Kido
56
14
0
09 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
79
128
0
08 May 2022
Automatic segmentation of meniscus based on MAE self-supervision and point-line weak supervision paradigm
Yuhan Xie
Kexin Jiang
Zhiyong Zhang
Shaolong Chen
Xiaodong Zhang
Changzhen Qiu
94
1
0
07 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
94
197
0
06 May 2022
MINI: Mining Implicit Novel Instances for Few-Shot Object Detection
Yuhang Cao
Jiaqi Wang
Yiqi Lin
Dahua Lin
ObjD
87
5
0
06 May 2022
BlobGAN: Spatially Disentangled Scene Representations
Dave Epstein
Taesung Park
Richard Y. Zhang
Eli Shechtman
Alexei A. Efros
GAN, SSL, OCL
99
43
0
05 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM, CLIP, OffRL
290
1,312
0
04 May 2022
Better plain ViT baselines for ImageNet-1k
Lucas Beyer
Xiaohua Zhai
Alexander Kolesnikov
ViT, VLM
103
118
0
03 May 2022
Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP)
Alex Fang
Gabriel Ilharco
Mitchell Wortsman
Yu Wan
Vaishaal Shankar
Achal Dave
Ludwig Schmidt
VLM, OOD
108
149
0
03 May 2022
Engineering flexible machine learning systems by traversing functionally-invariant paths
G. Raghavan
Bahey Tharwat
S. N. Hari
Dhruvil Satani
Matt Thomson
OOD, AI4CE
42
8
0
30 Apr 2022
StorSeismic: A new paradigm in deep learning for seismic processing
R. Harsuko
T. Alkhalifah
69
38
0
30 Apr 2022
Unsupervised Contrastive Learning based Transformer for Lung Nodule Detection
Chuang Niu
Ge Wang
ViT, MedIm
89
37
0
30 Apr 2022
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
Yuting Gao
Jinfeng Liu
Zihan Xu
Jinchao Zhang
Ke Li
Rongrong Ji
Chunhua Shen
VLM, CLIP
131
104
0
29 Apr 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
156
335
0
28 Apr 2022
AE-NeRF: Auto-Encoding Neural Radiance Fields for 3D-Aware Object Manipulation
Mira Kim
Jaehoon Ko
Kyusun Cho
J. Choi
Daewon Choi
Seung Wook Kim
72
4
0
28 Apr 2022
Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers
Micah Carroll
Jessy Lin
Orr Paradise
Raluca Georgescu
Mingfei Sun
...
Stephanie Milani
Katja Hofmann
Matthew J. Hausknecht
Anca Dragan
Sam Devlin
OffRL
122
10
0
28 Apr 2022
Self-Supervised Learning of Object Parts for Semantic Segmentation
A. Ziegler
Yuki M. Asano
SSL, OCL
117
103
0
27 Apr 2022
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Dading Chong
Helin Wang
Peilin Zhou
Qingcheng Zeng
79
68
0
27 Apr 2022
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
129
541
0
26 Apr 2022
Understanding The Robustness in Vision Transformers
Daquan Zhou
Zhiding Yu
Enze Xie
Chaowei Xiao
Anima Anandkumar
Jiashi Feng
J. Álvarez
ViT
154
193
0
26 Apr 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge
Yixiao Ge
Xihui Liu
Alex Jinpeng Wang
Jianping Wu
Ying Shan
Xiaohu Qie
Ping Luo
VLM
81
44
0
26 Apr 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
94
69
0
26 Apr 2022
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Rui Tian
Zuxuan Wu
Qi Dai
Han Hu
Yu-Gang Jiang
ViT, AAML
104
6
0
26 Apr 2022
Masked Image Modeling Advances 3D Medical Image Analysis
Zekai Chen
Devansh Agarwal
Kshitij Aggarwal
Wiem Safta
Samit Hirawat
V. Sethuraman
Mariann Micsinai Balan
Kevin Brown
81
74
0
25 Apr 2022
A Survey on Unsupervised Anomaly Detection Algorithms for Industrial Images
Yajie Cui
Zhaoxiang Liu
Kai Wang
OOD, DRL
110
47
0
24 Apr 2022
A Mask-Based Adversarial Defense Scheme
Weizhen Xu
Chenyi Zhang
Fangzhen Zhao
Liangda Fang
AAML
77
3
0
21 Apr 2022
Progressive Training of A Two-Stage Framework for Video Restoration
Mei Zheng
Qunliang Xing
Minglang Qiao
Mai Xu
Lai Jiang
Huaida Liu
Ying-Cong Chen
89
11
0
21 Apr 2022
A Masked Image Reconstruction Network for Document-level Relation Extraction
Li Zhang
Yidong Cheng
63
2
0
21 Apr 2022
Neuro-BERT: Rethinking Masked Autoencoding for Self-supervised Neurological Pretraining
Di Wu
Siyuan Li
Jie Yang
Mohamad Sawan
SSL
76
15
0
20 Apr 2022
Disentangling Spatial-Temporal Functional Brain Networks via Twin-Transformers
Xiao-Wen Yu
Lu Zhang
Lin Zhao
Yanjun Lyu
Tianming Liu
Dajiang Zhu
51
10
0
20 Apr 2022
Diverse Imagenet Models Transfer Better
Niv Nayman
A. Golbert
Asaf Noy
Tan Ping
Lihi Zelnik-Manor
73
0
0
19 Apr 2022
Missingness Bias in Model Debugging
Saachi Jain
Hadi Salman
E. Wong
Pengchuan Zhang
Vibhav Vineet
Sai H. Vemprala
Aleksander Madry
95
37
0
19 Apr 2022
SePiCo: Semantic-Guided Pixel Contrast for Domain Adaptive Semantic Segmentation
Binhui Xie
Shuang Li
Mingjiang Li
Chi Harold Liu
Gao Huang
Guoren Wang
106
150
0
19 Apr 2022