ResearchTrend.AI
Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,778 papers shown
ClimaX: A foundation model for weather and climate
Tung Nguyen
Johannes Brandstetter
Ashish Kapoor
Jayesh K. Gupta
Aditya Grover
AI4Cl, AI4CE
117
271
0
24 Jan 2023
RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous Driving
Angelika Ando
Spyros Gidaris
Andrei Bursuc
Gilles Puy
Alexandre Boulch
Renaud Marlet
ViT, 3DPC
73
79
0
24 Jan 2023
Zorro: the masked multimodal transformer
Adrià Recasens
Jason Lin
João Carreira
Drew Jaegle
Luyu Wang
...
Pauline Luc
Antoine Miech
Lucas Smaira
Ross Hemsley
Andrew Zisserman
92
21
0
23 Jan 2023
A Simple Recipe for Competitive Low-compute Self supervised Vision Models
Quentin Duval
Ishan Misra
Nicolas Ballas
70
9
0
23 Jan 2023
Self-Supervised Image Representation Learning: Transcending Masking with Paired Image Overlay
Yinhe Li
Han Ding
Shao-jun Wang
SSL
35
0
0
23 Jan 2023
Ti-MAE: Self-Supervised Masked Time Series Autoencoders
Zhe Li
Zhongwen Rao
Lujia Pan
Pengyun Wang
Zenglin Xu
AI4TS
83
53
0
21 Jan 2023
Self-Supervised Learning for Data Scarcity in a Fatigue Damage Prognostic Problem
A. Akrim
C. Gogu
R. Vingerhoeds
M. Salaün
AI4CE
102
25
0
20 Jan 2023
Multiview Compressive Coding for 3D Reconstruction
Chaozheng Wu
Justin Johnson
Jitendra Malik
Christoph Feichtenhofer
Georgia Gkioxari
128
75
0
19 Jan 2023
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Mahmoud Assran
Quentin Duval
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Yann LeCun
Nicolas Ballas
SSL, AI4TS, MDE
145
364
0
19 Jan 2023
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
Floris Weers
Vaishaal Shankar
Angelos Katharopoulos
Yinfei Yang
Tom Gunter
CLIP
54
5
0
19 Jan 2023
CLIPTER: Looking at the Bigger Picture in Scene Text Recognition
Aviad Aberdam
David Bensaid
Alona Golts
Roy Ganz
Oren Nuriel
Royee Tichauer
Shai Mazor
Ron Litman
VLM, CLIP
92
13
0
18 Jan 2023
ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image Representations
Chinmay Prabhakar
Hongwei Bran Li
Jiancheng Yang
Suprosanna Shit
Benedikt Wiestler
Bjoern Menze
ViT, MedIm
75
11
0
18 Jan 2023
Learning Customized Visual Models with Retrieval-Augmented Knowledge
Haotian Liu
Kilho Son
Jianwei Yang
Ce Liu
Jianfeng Gao
Yong Jae Lee
Chunyuan Li
VLM
129
56
0
17 Jan 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
150
603
1
17 Jan 2023
Vision Learners Meet Web Image-Text Pairs
Bingchen Zhao
Quan Cui
Hao Wu
Osamu Yoshie
Cheng Yang
Oisin Mac Aodha
VLM
86
5
0
17 Jan 2023
Long Range Pooling for 3D Large-Scale Scene Understanding
Xiang-Li Li
Meng-Hao Guo
Tai-Jiang Mu
Ralph Robert Martin
Shiyong Hu
3DV, 3DPC
66
2
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
95
11
0
17 Jan 2023
Cross-domain Self-supervised Framework for Photoacoustic Computed Tomography Image Reconstruction
Hengrong Lan
Lijie Huang
Zhiqiang Li
Jing Lv
Jianwen Luo
ViT, OOD
88
1
0
17 Jan 2023
CMAE-V: Contrastive Masked Autoencoders for Video Action Recognition
Cheng Lu
Xiaojie Jin
Zhicheng Huang
Qibin Hou
Ming-Ming Cheng
Jiashi Feng
61
9
0
15 Jan 2023
A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends
Jie Gui
Tuo Chen
Jing Zhang
Qiong Cao
Zhe Sun
Haoran Luo
Dacheng Tao
232
161
0
13 Jan 2023
SemPPL: Predicting pseudo-labels for better contrastive representations
Matko Bošnjak
Pierre Harvey Richemond
Nenad Tomašev
Florian Strub
Jacob Walker
Felix Hill
Lars Buesing
Razvan Pascanu
Charles Blundell
Jovana Mitrović
SSL, VLM
101
9
0
12 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM, AI4CE, LRM
122
17
0
12 Jan 2023
Dynamic Background Reconstruction via MAE for Infrared Small Target Detection
Jingchao Peng
Haitao Zhao
Kaijie Zhao
Zhongze Wang
Lujian Yao
39
2
0
11 Jan 2023
Vision Transformers Are Good Mask Auto-Labelers
Shiyi Lan
Xitong Yang
Zhiding Yu
Zuxuan Wu
J. Álvarez
Anima Anandkumar
ISeg, ViT, MedIm
95
19
0
10 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
89
106
0
09 Jan 2023
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Jidong Ge
YuXiang Liu
Jie Gui
Lanting Fang
Ming Lin
James T. Kwok
LiGuo Huang
B. Luo
SSL
86
5
0
08 Jan 2023
CiT: Curation in Training for Effective Vision-Language Data
Hu Xu
Saining Xie
Po-Yao (Bernie) Huang
Licheng Yu
Russ Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM, DiffM
69
26
0
05 Jan 2023
CRADL: Contrastive Representations for Unsupervised Anomaly Detection and Localization
Carsten T. Lüth
David Zimmerer
Gregor Koehler
Paul F. Jaeger
Hyunjin Park
Jens Petersen
Klaus H. Maier-Hein
UQCV, MedIm
51
4
0
05 Jan 2023
Single-round Self-supervised Distributed Learning using Vision Transformer
Sangjoon Park
Ik-jae Lee
Jun Won Kim
Jong Chul Ye
FedML, MedIm
69
1
0
05 Jan 2023
Event Camera Data Pre-training
Yan Yang
Liyuan Pan
Liu Liu
73
36
0
05 Jan 2023
Infomaxformer: Maximum Entropy Transformer for Long Time-Series Forecasting Problem
Peiwang Tang
Xianchao Zhang
AI4TS
116
6
0
04 Jan 2023
Semi-MAE: Masked Autoencoders for Semi-supervised Vision Transformers
Haojie Yu
Kangnian Zhao
Xiaoming Xu
ViT
81
1
0
04 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
126
26
0
03 Jan 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren
Fangyun Wei
Zheng Zhang
Han Hu
146
43
0
03 Jan 2023
Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling
Peng Wu
Li Chen
Hongyang Li
Xiaosong Jia
Junchi Yan
Yu Qiao
160
29
0
03 Jan 2023
PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation
Xiangtai Li
Shilin Xu
Yibo Yang
Haobo Yuan
Guangliang Cheng
Yu Tong
Zhouchen Lin
Ming-Hsuan Yang
Dacheng Tao
ViT
160
21
0
03 Jan 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
163
825
0
02 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
278
560
0
02 Jan 2023
Deep Learning Technique for Human Parsing: A Survey and Outlook
Lu Yang
Wenhe Jia
Shane Li
Q. Song
ViT
143
20
0
01 Jan 2023
Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty
K. Zou
Yidi Chen
Ling Huang
Xuedong Yuan
Xiaojing Shen
Meng Wang
Rick Siow Mong Goh
Yong-Jin Liu
Huazhu Fu
UQCV
90
4
0
01 Jan 2023
Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling
Xin Ma
Chang-Shu Liu
Chunyu Xie
Long Ye
Yafeng Deng
Xiang Ji
140
10
0
31 Dec 2022
Ponder: Point Cloud Pre-training via Neural Rendering
Di Huang
Sida Peng
Tong He
Honghui Yang
Xiaowei Zhou
Wanli Ouyang
SSL, 3DPC
116
43
0
31 Dec 2022
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
128
13
0
30 Dec 2022
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning
Colorado Reed
Ritwik Gupta
Shufan Li
S. Brockman
Christopher Funk
Brian Clipp
Kurt Keutzer
Salvatore Candido
M. Uyttendaele
Trevor Darrell
167
192
0
30 Dec 2022
Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra
Frederick Hoffman
Ken Chatfield
86
5
0
30 Dec 2022
3D Masked Modelling Advances Lesion Classification in Axial T2w Prostate MRI
Alvaro Fernandez-Quilez
C. Andersen
T. Eftestøl
S. R. Kjosavik
K. Oppedal
25
2
0
29 Dec 2022
PointVST: Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation
Qijian Zhang
Junhui Hou
3DPC
111
10
0
29 Dec 2022
Swin MAE: Masked Autoencoders for Small Datasets
Zi'an Xu
Yin Dai
Fayu Liu
Weibin Chen
Yue Liu
Li-Li Shi
Sheng Liu
Yuhang Zhou
SyDa, MedIm, ViT
151
28
0
28 Dec 2022
Exploring Vision Transformers as Diffusion Learners
He Cao
Jianan Wang
Tianhe Ren
Xianbiao Qi
Yihao Chen
Yuan Yao
Lefei Zhang
83
10
0
28 Dec 2022
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
83
5
0
28 Dec 2022