ResearchTrend.AI
Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT
    TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,611 papers shown
Title
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
27
527
0
27 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan Z. Li
32
48
0
27 May 2022
Transformer for Partial Differential Equations' Operator Learning
Zijie Li
Kazem Meidani
A. Farimani
42
140
0
26 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
141
637
0
26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
22
68
0
26 May 2022
HIRL: A General Framework for Hierarchical Image Representation Learning
Minghao Xu
Yuanfan Guo
Xuanyu Zhu
Jiawen Li
Zhenbang Sun
Jiangtao Tang
Yi Xu
Bingbing Ni
SSL
6
3
0
26 May 2022
Matryoshka Representation Learning
Aditya Kusupati
Gantavya Bhatt
Aniket Rege
Matthew Wallingford
Aditya Sinha
...
William Howard-Snyder
Kaifeng Chen
Sham Kakade
Prateek Jain
Ali Farhadi
26
73
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
25
53
0
26 May 2022
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
187
178
0
25 May 2022
An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
Ziquan Liu
Yi Tian Xu
Yuanhong Xu
Qi Qian
Hao Li
Rong Jin
Xiangyang Ji
Antoni B. Chan
OOD
37
14
0
25 May 2022
Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
Bin Ren
Yahui Liu
Yue Song
Wei Bi
Rita Cucchiara
N. Sebe
Wei Wang
46
21
0
25 May 2022
Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning
Chong Ma
Lin Zhao
Yuzhong Chen
Lu Zhang
Zhe Xiao
...
Tuo Zhang
Qian Wang
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
MedIm
34
30
0
25 May 2022
BolT: Fused Window Transformers for fMRI Time Series Analysis
H. Bedel
Irmak Sivgin
Onat Dalmaz
S. Dar
Tolga Çukur
51
54
0
23 May 2022
Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods
Randall Balestriero
Yann LeCun
SSL
11
129
0
23 May 2022
Decoder Denoising Pretraining for Semantic Segmentation
Emmanuel B. Asiedu
Simon Kornblith
Ting Chen
Niki Parmar
Matthias Minderer
Mohammad Norouzi
AI4CE
191
26
0
23 May 2022
Continual Barlow Twins: continual self-supervised learning for remote sensing semantic segmentation
V. Marsocci
Simone Scardapane
CLL
28
25
0
23 May 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
Yuan Yao
Qi-An Chen
Ao Zhang
Wei Ji
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
VLM
MLLM
21
38
0
23 May 2022
FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders
K. Wang
Bo-Lu Zhao
Xiangyu Peng
Zheng Hua Zhu
Jiankang Deng
Xinchao Wang
Hakan Bilen
Yang You
PICV
38
11
0
23 May 2022
GraphMAE: Self-Supervised Masked Graph Autoencoders
Zhenyu Hou
Xiao Liu
Yukuo Cen
Yuxiao Dong
Hongxia Yang
C. Wang
Jie Tang
SSL
42
544
0
22 May 2022
AutoLink: Self-supervised Learning of Human Skeletons and Object Outlines by Linking Keypoints
Xingzhe He
Bastian Wandt
Helge Rhodin
SSL
3DH
3DPC
32
18
0
21 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
19
1
0
21 May 2022
A Study on Transformer Configuration and Training Objective
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
30
7
0
21 May 2022
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Jue Jiang
N. Tyagi
K. Tringale
C. Crane
H. Veeraraghavan
MedIm
25
34
0
20 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
102
73
0
20 May 2022
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Jintang Li
Ruofan Wu
Wangbin Sun
Liang Chen
Sheng Tian
Liang Zhu
Changhua Meng
Zibin Zheng
Weiqiang Wang
SSL
16
79
0
20 May 2022
Mask-guided Vision Transformer (MG-ViT) for Few-Shot Learning
Yuzhong Chen
Zhe Xiao
Lin Zhao
Lu Zhang
Haixing Dai
...
Tuo Zhang
Changying Li
Dajiang Zhu
Tianming Liu
Xi Jiang
36
18
0
20 May 2022
Self-Supervised Time Series Representation Learning via Cross Reconstruction Transformer
Wen-Rang Zhang
Ling Yang
Shijia Geng
Shenda Hong
ViT
AI4TS
37
40
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
30
51
0
19 May 2022
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
Feng Liu
Xiaosong Zhang
Zhiliang Peng
Zonghao Guo
Fang Wan
Xian-Wei Ji
QiXiang Ye
ObjD
43
20
0
19 May 2022
TransTab: Learning Transferable Tabular Transformers Across Tables
Zifeng Wang
Jimeng Sun
LMTD
23
135
0
19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
172
11
0
19 May 2022
Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners
Hao Quan
Xingyu Li
Weixing Chen
Qun Bai
Mingchen Zou
Ruijie Yang
Tingting Zheng
R. Qi
Xin Gao
Xiaoyu Cui
MedIm
23
19
0
18 May 2022
Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging
Rui Yan
Liangqiong Qu
Qingyue Wei
Shih-Cheng Huang
Liyue Shen
D. Rubin
Lei Xing
Yuyin Zhou
FedML
70
89
0
17 May 2022
Text Detection & Recognition in the Wild for Robot Localization
Z. Raisi
John S. Zelek
14
0
0
17 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
43
541
0
17 May 2022
Transformers in 3D Point Clouds: A Survey
Dening Lu
Qian Xie
Mingqiang Wei
Kyle Gao
Linlin Xu
Jonathan Li
3DPC
ViT
32
49
0
16 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Zhepei Wang
Cem Subakan
Xilin Jiang
Junkai Wu
Efthymios Tzinis
Mirco Ravanelli
Paris Smaragdis
CLL
SSL
57
19
0
15 May 2022
Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel
Ziyang Jiang
Tongshu Zheng
Yiling Liu
David Carlson
15
4
0
15 May 2022
ETAD: Training Action Detection End to End on a Laptop
Shuming Liu
Mengmeng Xu
Chen Zhao
Xu Zhao
Bernard Ghanem
44
6
0
14 May 2022
A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities
Yisheng Song
Ting-Yuan Wang
S. Mondal
J. P. Sahoo
SLR
36
342
0
13 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
19
34
0
12 May 2022
One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code
Yong Dai
Duyu Tang
Liangxin Liu
Minghuan Tan
Cong Zhou
Jingquan Wang
Zhangyin Feng
Fan Zhang
Xueyu Hu
Shuming Shi
VLM
MoE
21
26
0
12 May 2022
CV4Code: Sourcecode Understanding via Visual Code Representations
Ruibo Shi
Lili Tao
Rohan Saphal
Fran Silavong
Sean J. Moran
21
0
0
11 May 2022
AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation
Xu Cao
Xiaoye Li
Liya Ma
Yi Huang
X. Feng
Zening Chen
H. Zeng
Jianguo Cao
ViT
11
21
0
11 May 2022
Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder
S. Ly
Bai Lin
Hung Q. Vo
D. Maric
B. Roysam
H. V. Nguyen
26
0
0
10 May 2022
Reconstruction Enhanced Multi-View Contrastive Learning for Anomaly Detection on Attributed Networks
Jiaqiang Zhang
Senzhang Wang
Songcan Chen
11
44
0
10 May 2022
Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
Haiyang Yang
Meilin Chen
Yizhou Wang
Shixiang Tang
Feng Zhu
Lei Bai
Rui Zhao
Wanli Ouyang
16
16
0
10 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
59
600
0
09 May 2022
Anatomy-aware Self-supervised Learning for Anomaly Detection in Chest Radiographs
Junya Sato
Yuki Suzuki
T. Wataya
Daiki Nishigaki
Kosuke Kita
Kazuki Yamagata
Noriyuki Tomiyama
Shoji Kido
13
14
0
09 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
19
121
0
08 May 2022