Masked Autoencoders Are Scalable Vision Learners
arXiv:2111.06377 · v1–v3 (latest: v3) · 11 November 2021
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
ViT, TPM
arXiv (abs) · PDF · HTML
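For readers arriving from the citing list, the recipe named in the title is to hide a large random subset of image patches, encode only the visible ones, and reconstruct the missing pixels with a small decoder. Below is a minimal NumPy sketch of just the random patch-masking step, assuming the paper's default 75% mask ratio; the function name, shapes, and the standalone example are illustrative and not taken from the authors' code.

    import numpy as np

    def random_mask_patches(patches, mask_ratio=0.75, rng=None):
        # patches: (num_patches, patch_dim), e.g. 196 x 768 for a 224x224
        # image cut into 16x16 patches. mask_ratio=0.75 follows the paper's
        # default; everything else here is an illustrative sketch.
        rng = rng or np.random.default_rng()
        num_patches = patches.shape[0]
        num_keep = int(num_patches * (1 - mask_ratio))
        keep_idx = rng.permutation(num_patches)[:num_keep]  # patches the encoder would see
        mask = np.ones(num_patches, dtype=bool)              # True = masked (to be reconstructed)
        mask[keep_idx] = False
        return patches[keep_idx], mask

    # Example: 196 patches of dim 768; 49 stay visible, 147 are masked.
    patches = np.random.randn(196, 768).astype(np.float32)
    visible, mask = random_mask_patches(patches)
    print(visible.shape, int(mask.sum()))   # (49, 768) 147

In the paper, only the visible patches are fed to the ViT encoder, which is what keeps pretraining cheap despite the high masking ratio.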

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction
Han Li, Dan Zhao, Jianyang Zeng
82 · 64 · 0 · 02 Jun 2022
MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining
Pengyuan Lyu, Chengquan Zhang, Shanshan Liu, Meina Qiao, Yangliu Xu, Liang Wu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
119 · 43 · 0 · 01 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen, Ming Hu, Boyang Albert Li, Mohamed Elhoseiny
146 · 37 · 0 · 01 Jun 2022
CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
Junlin Han, L. Petersson, Hongdong Li, Ian Reid
86 · 9 · 0 · 31 May 2022
Few-Shot Diffusion Models
Giorgio Giannone, Didrik Nielsen, Ole Winther
DiffM
231 · 51 · 0 · 30 May 2022
Self-Supervised Visual Representation Learning with Semantic Grouping
Xin Wen, Bingchen Zhao, Anlin Zheng, Xinming Zhang, Xiaojuan Qi
SSL
219 · 74 · 0 · 30 May 2022
Chefs' Random Tables: Non-Trigonometric Random Features
Valerii Likhosherstov, K. Choromanski, Kumar Avinava Dubey, Frederick Liu, Tamás Sarlós, Adrian Weller
90 · 18 · 0 · 30 May 2022
GMML is All you Need
Sara Atito, Muhammad Awais, J. Kittler
ViT, VLM
87 · 18 · 0 · 30 May 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Muning Wen, J. Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang
105 · 193 · 0 · 30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang, Yunjie Tian, Wei Huang, QiXiang Ye, Qi Dai, Lingxi Xie, Qi Tian
104 · 29 · 0 · 30 May 2022
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Tianyang Hu, Zhili Liu, Fengwei Zhou, Wei Cao, Weiran Huang
SSL
105 · 28 · 0 · 30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio
AI4CE
200 · 18 · 0 · 30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang, Yangguang Li, Diana Marculescu
SSL, TPM, ViT
108 · 23 · 0 · 28 May 2022
MDMLP: Image Classification from Scratch on Small Datasets with MLP
Tianxu Lv, Chongyang Bai, Chaojie Wang
66 · 6 · 0 · 28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian Sun, Weiming Hu
ViT
148 · 46 · 0 · 28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang, Ziyu Guo, Rongyao Fang, Bingyan Zhao, Dong Wang, Yu Qiao, Hongsheng Li, Peng Gao
3DPC
258 · 262 · 0 · 28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu, Shentong Mo
ViT, OCL
75 · 15 · 0 · 28 May 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel
93 · 119 · 0 · 27 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, B. Guo
CLIP
158 · 128 · 0 · 27 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Qinghong Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang
VLM
172 · 562 · 0 · 27 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li, Di Wu, Fang Wu, Lei Shang, Stan Z. Li
84 · 49 · 0 · 27 May 2022
Transformer for Partial Differential Equations' Operator Learning
Zijie Li, Kazem Meidani, A. Farimani
117 · 172 · 0 · 26 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen, Chongjian Ge, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, Ping Luo
241 · 703 · 0 · 26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang, Shan You, Mingkai Zheng, Fei Wang, Chao Qian, T. Yamasaki
125 · 72 · 0 · 26 May 2022
HIRL: A General Framework for Hierarchical Image Representation Learning
Minghao Xu, Yuanfan Guo, Xuanyu Zhu, Jiawen Li, Zhenbang Sun, Jiangtao Tang, Yi Xu, Bingbing Ni
SSL
32 · 3 · 0 · 26 May 2022
Matryoshka Representation Learning
Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, ..., William Howard-Snyder, Kaifeng Chen, Sham Kakade, Prateek Jain, Ali Farhadi
150 · 90 · 0 · 26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu, Hongsheng Li
67 · 55 · 0 · 26 May 2022
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen
DiffM
265 · 181 · 0 · 25 May 2022
An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
Ziquan Liu, Yi Tian Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan
OOD
89 · 16 · 0 · 25 May 2022
Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
Bin Ren, Yahui Liu, Yue Song, Wei Bi, Rita Cucchiara, N. Sebe, Wei Wang
122 · 28 · 0 · 25 May 2022
Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning
Chong Ma, Lin Zhao, Yuzhong Chen, Lu Zhang, Zhe Xiao, ..., Tuo Zhang, Qian Wang, Dinggang Shen, Dajiang Zhu, Tianming Liu
ViT, MedIm
105 · 30 · 0 · 25 May 2022
BolT: Fused Window Transformers for fMRI Time Series Analysis
H. Bedel, Irmak Sivgin, Onat Dalmaz, S. Dar, Tolga Çukur
130 · 59 · 0 · 23 May 2022
Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods
Randall Balestriero, Yann LeCun
SSL
112 · 135 · 0 · 23 May 2022
Decoder Denoising Pretraining for Semantic Segmentation
Emmanuel B. Asiedu, Simon Kornblith, Ting Chen, Niki Parmar, Matthias Minderer, Mohammad Norouzi
AI4CE
262 · 27 · 0 · 23 May 2022
Continual Barlow Twins: continual self-supervised learning for remote sensing semantic segmentation
V. Marsocci, Simone Scardapane
CLL
95 · 27 · 0 · 23 May 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
Yuan Yao, Qi-An Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
VLM, MLLM
93 · 38 · 0 · 23 May 2022
FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders
Kaidi Wang, Bo Zhao, Xiangyu Peng, Zheng Hua Zhu, Jiankang Deng, Xinchao Wang, Hakan Bilen, Yang You
PICV
121 · 11 · 0 · 23 May 2022
GraphMAE: Self-Supervised Masked Graph Autoencoders
Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, C. Wang, Jie Tang
SSL
141 · 593 · 0 · 22 May 2022
AutoLink: Self-supervised Learning of Human Skeletons and Object Outlines by Linking Keypoints
Xingzhe He, Bastian Wandt, Helge Rhodin
SSL, 3DH, 3DPC
168 · 18 · 0 · 21 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao, Xuesong Yin, Yuan Chang, Honggu Zhou
SSL
47 · 1 · 0 · 21 May 2022
A Study on Transformer Configuration and Training Objective
Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Yongming Chen, Xin Jiang, Yang You
87 · 9 · 0 · 21 May 2022
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Jue Jiang, N. Tyagi, K. Tringale, C. Crane, Harini Veeraraghavan
MedIm
104 · 37 · 0 · 20 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li, Wenhai Wang, Lingfeng Yang, Jian Yang
179 · 75 · 0 · 20 May 2022
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Jintang Li, Ruofan Wu, Wangbin Sun, Liang Chen, Sheng Tian, Liang Zhu, Changhua Meng, Zibin Zheng, Weiqiang Wang
SSL
102 · 96 · 0 · 20 May 2022
Mask-guided Vision Transformer (MG-ViT) for Few-Shot Learning
Yuzhong Chen, Zhe Xiao, Lin Zhao, Lu Zhang, Haixing Dai, ..., Tuo Zhang, Changying Li, Dajiang Zhu, Tianming Liu, Xi Jiang
112 · 18 · 0 · 20 May 2022
Self-Supervised Time Series Representation Learning via Cross Reconstruction Transformer
Wen-Rang Zhang, Ling Yang, Shijia Geng, Shenda Hong
ViT, AI4TS
94 · 44 · 0 · 20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi, Yixiao Ge, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, Xiaohu Qie
VLM
75 · 54 · 0 · 19 May 2022
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xian-Wei Ji, QiXiang Ye
ObjD
104 · 21 · 0 · 19 May 2022
TransTab: Learning Transferable Tabular Transformers Across Tables
Zifeng Wang, Jimeng Sun
LMTD
85 · 151 · 0 · 19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alexander G. Hauptmann, Jianfeng Gao, Yonatan Bisk
VLM, ViT
203 · 11 · 0 · 19 May 2022