Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown
ShapeFormer: Transformer-based Shape Completion via Sparse Representation
Xingguang Yan
Liqiang Lin
Niloy J. Mitra
Dani Lischinski
Daniel Cohen-Or
Hui Huang
ViT
182
118
0
25 Jan 2022
Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha
Xinlei He
Ning Yu
Michael Backes
Yang Zhang
136
35
0
19 Jan 2022
TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval
Yue Ruan
Han-Hung Lee
Yiming Zhang
Ke Zhang
Angel X. Chang
95
22
0
19 Jan 2022
RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training
Luyang Wang
Feng Liang
Yangguang Li
Honggang Zhang
Wanli Ouyang
Jing Shao
ViT
94
25
0
18 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
141
107
0
16 Jan 2022
Transferability in Deep Learning: A Survey
Junguang Jiang
Yang Shu
Jianmin Wang
Mingsheng Long
OOD
93
104
0
15 Jan 2022
ViT2Hash: Unsupervised Information-Preserving Hashing
Qinkang Gong
Liangdao Wang
Hanjiang Lai
Yan Pan
Jian Yin
30
4
0
14 Jan 2022
Time Series Generation with Masked Autoencoder
Meng-yue Zha
SiuTim Wong
Mengqi Liu
Tong Zhang
Kani Chen
SyDa, AI4TS
64
17
0
14 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
202
5,261
0
10 Jan 2022
VGAER: Graph Neural Network Reconstruction based Community Detection
Chenyang Qiu
Zhaoci Huang
Wenzhe Xu
Huijia Li
57
17
0
08 Jan 2022
MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs
Qiaoyu Tan
Ninghao Liu
Xiao Shi Huang
Rui Chen
Soo-Hyun Choi
Helen Zhou
SSL
76
41
0
07 Jan 2022
Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task
Quan Feng
Songcan Chen
108
5
0
07 Jan 2022
Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning
Siming Yan
Zhenpei Yang
Haoxiang Li
Chen Song
Li Guan
Hao Kang
G. Hua
Qi-Xing Huang
3DPC
146
63
0
03 Jan 2022
SLIP: Self-supervision meets Language-Image Pre-training
Norman Mu
Alexander Kirillov
David Wagner
Saining Xie
VLM, CLIP
158
492
0
23 Dec 2021
Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
Alaaeldin El-Nouby
Gautier Izacard
Hugo Touvron
Ivan Laptev
Hervé Jégou
Edouard Grave
SSL
106
152
0
20 Dec 2021
On Efficient Transformer-Based Image Pre-training for Low-Level Vision
Wenbo Li
Xin Lu
Shengju Qian
Jiangbo Lu
Xinming Zhang
Jiaya Jia
ViT
140
88
0
19 Dec 2021
RELAX: Representation Learning Explainability
Kristoffer Wickstrøm
Daniel J. Trosten
Sigurd Løkse
Ahcène Boubekki
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
FAtt
51
15
0
19 Dec 2021
Contrastive Vision-Language Pre-training with Limited Resources
Quan Cui
Boyan Zhou
Yu Guo
Weidong Yin
Hao Wu
Osamu Yoshie
Yubo Chen
VLM, CLIP
53
34
0
17 Dec 2021
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
190
674
0
16 Dec 2021
Rethinking Nearest Neighbors for Visual Classification
Menglin Jia
Bor-Chun Chen
Zuxuan Wu
Claire Cardie
Serge Belongie
Ser-Nam Lim
SSL
92
10
0
15 Dec 2021
Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification
Lin Wan
Qianyan Jing
Zongyuan Sun
Chuan Zhang
Zhihang Li
Yehansen Chen
SSL
77
5
0
12 Dec 2021
PE-former: Pose Estimation Transformer
Paschalis Panteleris
Antonis Argyros
ViT
74
12
0
09 Dec 2021
Semi-Supervised Medical Image Segmentation via Cross Teaching between CNN and Transformer
Xiangde Luo
Minhao Hu
Tao Song
Guotai Wang
Shaoting Zhang
ViT, MedIm
70
211
0
09 Dec 2021
ViewCLR: Learning Self-supervised Video Representation for Unseen Viewpoints
Srijan Das
Michael S. Ryoo
SSL
90
20
0
07 Dec 2021
E$^2$(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition
Chiara Plizzari
M. Planamente
Gabriele Goletto
Marco Cannici
Emanuele Gusso
Matteo Matteucci
Barbara Caputo
EgoV
104
57
0
07 Dec 2021
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM, VLM
285
539
0
06 Dec 2021
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks
Linghui Meng
Muning Wen
Yaodong Yang
Chenyang Le
Xiyun Li
Weinan Zhang
Ying Wen
Haifeng Zhang
Jun Wang
Bo Xu
OffRL
98
43
0
06 Dec 2021
A Survey of Deep Learning for Low-Shot Object Detection
Qihan Huang
Haofei Zhang
Mengqi Xue
Mingli Song
Xiuming Zhang
ObjD
117
19
0
06 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
108
209
0
02 Dec 2021
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM, CLIP
224
582
0
02 Dec 2021
PTCT: Patches with 3D-Temporal Convolutional Transformer Network for Precipitation Nowcasting
Ziao Yang
Xiangru Yang
Qifeng Lin
ViT, AI4TS
72
4
0
02 Dec 2021
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin
Heng Fan
Zhipeng Zhang
Yong-mei Xu
Haibin Ling
ViT
107
322
0
02 Dec 2021
MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
Sara Atito
Muhammad Awais
Ammarah Farooq
Zhenhua Feng
J. Kittler
56
17
0
30 Nov 2021
EdiBERT, a generative model for image editing
Thibaut Issenhuth
Ugo Tanielian
Jérémie Mary
David Picard
DiffM
100
12
0
30 Nov 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
173
689
0
29 Nov 2021
Natural Scene Text Editing Based on AI
Yujie Zhang
54
0
0
26 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
150
245
0
24 Nov 2021
ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment
Robin Karlsson
Tomoki Hayashi
Keisuke Fujii
Alexander Carballo
Kento Ohtani
K. Takeda
SSL
60
4
0
24 Nov 2021
RegionCL: Can Simple Region Swapping Contribute to Contrastive Learning?
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
SSL
134
18
0
24 Nov 2021
Learning Representation for Clustering via Prototype Scattering and Positive Sampling
Zhizhong Huang
Jie Chen
Junping Zhang
Hongming Shan
84
98
0
23 Nov 2021
RIO: Rotation-equivariance supervised learning of robust inertial odometry
Caifa Zhou
Xiya Cao
Dandan Zeng
Yongliang Wang
OOD, SSL
60
23
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollar
Kaiming He
Ross B. Girshick
113
170
0
22 Nov 2021
Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
142
1,735
0
15 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS, ViT
189
356
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
215
100
0
07 Nov 2021
Towards the Generalization of Contrastive Self-Supervised Learning
Weiran Huang
Mingyang Yi
Xuyang Zhao
Zihao Jiang
SSL
98
115
0
01 Nov 2021
GenURL: A General Framework for Unsupervised Representation Learning
Siyuan Li
Zicheng Liu
Z. Zang
Di Wu
Zhiyuan Chen
Stan Z. Li
OOD, 3DGS, OffRL
136
9
0
27 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
98
19
0
17 Oct 2021
Self-Supervised Learning by Estimating Twin Class Distributions
Feng Wang
Tao Kong
Rufeng Zhang
Huaping Liu
Hang Li
SSL
105
20
0
14 Oct 2021
Pre-trained Language Models in Biomedical Domain: A Systematic Survey
Benyou Wang
Qianqian Xie
Jiahuan Pei
Zhihong Chen
Prayag Tiwari
Zhao Li
Jie Fu
LM&MA, AI4CE
154
172
0
11 Oct 2021