Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion
Junkun Jiang
Jie Chen
Yike Guo
3DH
54
10
0
15 Jul 2022
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
89
78
0
14 Jul 2022
Benchmarking Omni-Vision Representation through the Lens of Visual Realms
Yuanhan Zhang
Zhen-fei Yin
Jing Shao
Ziwei Liu
VLM
112
29
0
14 Jul 2022
Convolutional Bypasses Are Better Vision Transformer Adapters
Shibo Jie
Zhi-Hong Deng
VP, VLM
91
137
0
14 Jul 2022
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
107
46
0
14 Jul 2022
iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision Transformer
Jooyeol Yun
Sanghyeon Lee
Minho Park
Jaegul Choo
ViT
50
2
0
14 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
142
290
0
13 Jul 2022
Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning
Boeun Kim
H. Chang
Jungho Kim
J. Choi
ViT
87
52
0
13 Jul 2022
DSPNet: Towards Slimmable Pretrained Networks based on Discriminative Self-supervised Learning
Shaoru Wang
Zeming Li
Jin Gao
Liang Li
Weiming Hu
61
0
0
13 Jul 2022
Dateformer: Time-modeling Transformer for Longer-term Series Forecasting
Julong Young
Junhui Chen
Feihu Huang
Jian Peng
AI4TS
29
1
0
12 Jul 2022
Occluded Human Body Capture with Self-Supervised Spatial-Temporal Motion Prior
Buzhen Huang
Y. Shu
Jingyi Ju
Yangang Wang
3DH
82
12
0
12 Jul 2022
Outpainting by Queries
Kai Yao
Penglei Gao
Xi Yang
Kaizhu Huang
Jie Sun
Rui Zhang
ViT
87
13
0
12 Jul 2022
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
Andrii Zadaianchuk
Matthaeus Kleindessner
Yi Zhu
Francesco Locatello
Thomas Brox
130
53
0
11 Jul 2022
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao
Jeffrey Pennington
101
10
0
11 Jul 2022
Brain-Aware Replacements for Supervised Contrastive Learning in Detection of Alzheimer's Disease
Mehmet Saygin Seyfiouglu
Zixuan Liu
Pranav Kamath
Sadjyot Gangolli
Sheng Wang
T. Grabowski
Linda G. Shapiro
MedIm
90
15
0
11 Jul 2022
Domain Alignment Meets Fully Test-Time Adaptation
Kowshik Thopalli
Pavan Turaga
Jayaraman J. Thiagarajan
OOD, TTA
52
4
0
09 Jul 2022
Big Learning
Yulai Cong
Miaoyun Zhao
AI4CE
94
0
0
08 Jul 2022
Pixel-level Correspondence for Self-Supervised Learning from Video
Yash Sharma
Yi Zhu
Chris Russell
Thomas Brox
SSL
49
4
0
08 Jul 2022
Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain
Tong Zhang
Peng Gao
Hao-Chen Dong
Zhuang Yin
Guanqun Wang
Wei Zhang
He Chen
77
34
0
08 Jul 2022
Revisiting Pretraining Objectives for Tabular Deep Learning
Ivan Rubachev
Artem Alekberov
Yu. V. Gorishniy
Artem Babenko
LMTD
58
47
0
07 Jul 2022
Masked Surfel Prediction for Self-Supervised Point Cloud Learning
Yabin Zhang
Jiehong Lin
Chenhang He
Yuxiao Chen
Kui Jia
Lei Zhang
3DPC
81
21
0
07 Jul 2022
Vision Transformers: State of the Art and Research Challenges
Bo-Kai Ruan
Hong-Han Shuai
Wen-Huang Cheng
ViT
70
18
0
07 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
99
201
0
06 Jul 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
113
24
0
06 Jul 2022
Masked Autoencoders in 3D Point Cloud Representation Learning
Jincen Jiang
Xuequan Lu
Lizhi Zhao
Richard Dazeley
Meili Wang
3DPC, ViT
144
29
0
04 Jul 2022
DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Zhuo Chen
Yufen Huang
Jiaoyan Chen
Yuxia Geng
Wen Zhang
Yin Fang
Jeff Z. Pan
Huajun Chen
VLM
135
66
0
04 Jul 2022
A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction
Wei Shen
Zelin Peng
Xuehui Wang
Huayu Wang
Jiazhong Cen
Dongsheng Jiang
Lingxi Xie
Xiaokang Yang
Qi Tian
VLM
114
84
0
04 Jul 2022
Masked Self-Supervision for Remaining Useful Lifetime Prediction in Machine Tools
Haoren Guo
H. Zhu
Jiahui Wang
P. Vadakkepat
W. Ho
T. Lee
104
12
0
04 Jul 2022
Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds
Georg Hess
Johan Jaxing
Elias Svensson
David Hagerman
Christoffer Petersson
Lennart Svensson
3DPC, ViT
115
36
0
01 Jul 2022
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
Sanat Ramesh
V. Srivastav
Deepak Alapatt
Tong Yu
Aditya Murali
...
Saurav Sharma
A. Fleurentin
Georgios Exarchakis
Alexandros Karargyris
N. Padoy
129
46
0
01 Jul 2022
Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition
Mingkun Yang
Minghui Liao
Pu Lu
Jing Wang
Shenggao Zhu
Hualin Luo
Qingzhen Tian
X. Bai
SSL
117
60
0
01 Jul 2022
GaitForeMer: Self-Supervised Pre-Training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation
Mark Endo
K. Poston
E. Sullivan
L. Fei-Fei
K. Pohl
Ehsan Adeli
86
19
0
30 Jun 2022
Transfer Learning with Deep Tabular Models
Roman Levin
Valeriia Cherepanova
Avi Schwarzschild
Arpit Bansal
C. Bayan Bruss
Tom Goldstein
A. Wilson
Micah Goldblum
OOD, FedML, LMTD
141
64
0
30 Jun 2022
Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision
Zifeng Wang
Jimeng Sun
94
25
0
29 Jun 2022
BATFormer: Towards Boundary-Aware Lightweight Transformer for Efficient Medical Image Segmentation
Xian Lin
Li Yu
Kwang-Ting Cheng
Zengqiang Yan
ViT, MedIm
91
35
0
29 Jun 2022
Masked World Models for Visual Control
Younggyo Seo
Danijar Hafner
Hao Liu
Fangchen Liu
Stephen James
Kimin Lee
Pieter Abbeel
OffRL
177
149
0
28 Jun 2022
Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark
Chao Fan
Saihui Hou
Jilong Wang
Yongzhen Huang
Shiqi Yu
CVBM, SSL
125
38
0
28 Jun 2022
Self-supervised Learning in Remote Sensing: A Review
Yi Wang
C. Albrecht
Nassim Ait Ali Braham
Lichao Mou
Xiao Xiang Zhu
159
228
0
27 Jun 2022
Multiple Instance Learning with Mixed Supervision in Gleason Grading
Hao Bian
Zhucheng Shao
Yang Chen
Yifeng Wang
Haoqian Wang
Jian Zhang
Yongbing Zhang
48
10
0
26 Jun 2022
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan
Zhangyang Gao
Lirong Wu
Yongjie Xu
Jun Xia
Siyuan Li
Stan Z. Li
125
110
0
24 Jun 2022
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
188
120
0
23 Jun 2022
Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization
Peixian Chen
Kekai Sheng
Mengdan Zhang
Mingbao Lin
Yunhang Shen
Shaohui Lin
Bo Ren
Ke Li
VLM, ObjD
127
27
0
22 Jun 2022
Parallel Pre-trained Transformers (PPT) for Synthetic Data-based Instance Segmentation
Ming Li
Jie Wu
Jin Cai
J. Qin
Yuxi Ren
Xu Xiao
Min Zheng
Rui Wang
X. Pan
ViT
73
2
0
22 Jun 2022
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
73
34
0
21 Jun 2022
SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders
Gang Li
Heliang Zheng
Daqing Liu
Chaoyue Wang
Fuchun Sun
Changwen Zheng
119
130
0
21 Jun 2022
Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
Chen Min
Xinli Xu
Dawei Zhao
Liang Xiao
Yiming Nie
Bin Dai
3DPC
148
53
0
20 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
121
35
0
19 Jun 2022
Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting
Zezhi Shao
Zhao Zhang
Fei Wang
Yongjun Xu
AI4TS
109
228
0
18 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
128
136
0
18 Jun 2022
Bag of Image Patch Embedding Behind the Success of Self-Supervised Learning
Yubei Chen
Adrien Bardes
Zengyi Li
Yann LeCun
SSL, AIFin
125
7
0
17 Jun 2022