ResearchTrend.AI

Masked Autoencoders Are Scalable Vision Learners (arXiv 2111.06377)

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,778 papers shown
Spiking sampling network for image sparse representation and dynamic vision sensor data compression
Chunming Jiang
Yilei Zhang
21
0
0
08 Nov 2022
Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report
Andrey D. Ignatov
Radu Timofte
Shuai Liu
Chaoyu Feng
Furui Bai
...
Xin Lou
Wei Zhou
Cong Pang
Haina Qin
Mingxuan Cai
96
24
0
07 Nov 2022
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining
Qiang Chen
Jian Wang
Chuchu Han
Shangang Zhang
Zexian Li
...
Haocheng Feng
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
ViT, VLM
92
45
0
07 Nov 2022
Okapi: Generalising Better by Making Statistical Matches Match
Myles Bartlett
Sara Romiti
V. Sharmanska
Novi Quadrianto
83
3
0
07 Nov 2022
Generative Transformers for Design Concept Generation
Qihao Zhu
Jianxi Luo
AI4CE
79
50
0
07 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
107
65
0
07 Nov 2022
Distilling Representations from GAN Generator via Squeeze and Span
Yu Yang
Xiaotian Cheng
Chang-rui Liu
Hakan Bilen
Xiang Ji
GAN
98
0
0
06 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
28
5
0
04 Nov 2022
Could Giant Pretrained Image Models Extract Universal Representations?
Yutong Lin
Ze Liu
Zheng Zhang
Han Hu
Nanning Zheng
Stephen Lin
Yue Cao
VLM
106
9
0
03 Nov 2022
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
87
1
0
03 Nov 2022
Attention-based Neural Cellular Automata
Mattie Tesfaldet
Derek Nowrouzezahrai
C. Pal
ViT
93
18
0
02 Nov 2022
RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild
Weiyao Wang
Byung-Hak Kim
Varun Ganapathi
SSL, LM, TD
63
1
0
02 Nov 2022
Siamese Transition Masked Autoencoders as Uniform Unsupervised Visual Anomaly Detector
Haiming Yao
Xue Wang
Wenyong Yu
87
9
0
01 Nov 2022
RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representation from X-Ray Images
Guang Li
Ren Togo
Takahiro Ogawa
Miki Haseyama
56
0
0
01 Nov 2022
Self-supervised Character-to-Character Distillation for Text Recognition
Tongkun Guan
Wei Shen
Xuehang Yang
Qi Feng
Zekun Jiang
Xiaokang Yang
147
25
0
01 Nov 2022
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
133
4
0
01 Nov 2022
ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis
Xu Cao
Wenqian Ye
Elena Sizikova
Xue Bai
Megan Coffee
H. Zeng
Jianguo Cao
64
17
0
30 Oct 2022
A simple, efficient and scalable contrastive masked autoencoder for learning visual representations
Shlok Kumar Mishra
Joshua Robinson
Huiwen Chang
David Jacobs
Aaron Sarna
Aaron Maschinot
Dilip Krishnan
DiffM
114
31
0
30 Oct 2022
Unsupervised Learning of Structured Representations via Closed-Loop Transcription
Shengbang Tong
Xili Dai
Yubei Chen
Mingyang Li
Zengyi Li
Brent Yi
Yann LeCun
Yi Ma
SSL, DRL
96
7
0
30 Oct 2022
Parameter-Efficient Tuning Makes a Good Classification Head
Zhuoyi Yang
Ming Ding
Yanhui Guo
Qingsong Lv
Jie Tang
VLM
108
14
0
30 Oct 2022
Multimodal Transformer for Parallel Concatenated Variational Autoencoders
Stephen D. Liang
J. Mendel
ViT
70
5
0
28 Oct 2022
Spectrograms Are Sequences of Patches
Leyi Zhao
Yi Li
SSL
57
0
0
28 Oct 2022
Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation
Bowen Ma
Rudong An
Wei Zhang
Yu-qiong Ding
Zeng Zhao
Rongsheng Zhang
Tangjie Lv
Changjie Fan
Zhipeng Hu
CVBM
103
21
0
28 Oct 2022
MAEEG: Masked Auto-encoder for EEG Representation Learning
H. Chien
Hanlin Goh
Christopher M. Sandino
Joseph Y. Cheng
71
49
0
27 Oct 2022
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang
Changli Tang
Ziyang Ma
Zhisheng Zheng
Xie Chen
Weiqiang Zhang
128
1
0
27 Oct 2022
ProContEXT: Exploring Progressive Context Transformer for Tracking
Jinpeng Lan
Zhi-Qi Cheng
Ju He
Chenyang Li
Bin Luo
Xueting Bao
Wangmeng Xiang
Yifeng Geng
Xuansong Xie
102
31
0
27 Oct 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Takaaki Saeki
Heiga Zen
Zhehuai Chen
Nobuyuki Morioka
Gary Wang
Yu Zhang
Ankur Bapna
Andrew Rosenberg
Bhuvana Ramabhadran
130
20
0
27 Oct 2022
Masked Autoencoders Are Articulatory Learners
Ahmed Adel Attia
C. Espy-Wilson
38
6
0
27 Oct 2022
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Daniel Gehrig
Luc Van Gool
90
25
0
27 Oct 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
105
33
0
26 Oct 2022
SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation
Junliang Chen
Xiaodong Zhao
Cheng Luo
Linlin Shen
ViT
118
3
0
26 Oct 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
80
3
0
25 Oct 2022
PlanT: Explainable Planning Transformers via Object-Level Representations
Katrin Renz
Kashyap Chitta
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
Andreas Geiger
ViT
119
101
0
25 Oct 2022
Learning Explicit Object-Centric Representations with Vision Transformers
Oscar Vikström
Alexander Ilin
OCL, ViT
79
4
0
25 Oct 2022
The Robustness Limits of SoTA Vision Models to Natural Variation
Mark Ibrahim
Q. Garrido
Ari S. Morcos
Diane Bouchacourt
VLM
99
16
0
24 Oct 2022
Instruction-Following Agents with Multimodal Transformer
Hao Liu
Lisa Lee
Kimin Lee
Pieter Abbeel
LM&Ro
125
11
0
24 Oct 2022
Robust Self-Supervised Learning with Lie Groups
Mark Ibrahim
Diane Bouchacourt
Ari S. Morcos
SSL, OOD
73
6
0
24 Oct 2022
Deep Model Reassembly
Xingyi Yang
Zhou Daquan
Songhua Liu
Jingwen Ye
Xinchao Wang
MoMe
98
129
0
24 Oct 2022
Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers
Zhiwei Lin
Ze Yang
Yongtao Wang
ViT
83
2
0
24 Oct 2022
Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning
Nan Xue
Tianfu Wu
Song Bai
Fu-Dong Wang
Gui-Song Xia
Lefei Zhang
Philip Torr
60
26
0
24 Oct 2022
Removing Radio Frequency Interference from Auroral Kilometric Radiation with Stacked Autoencoders
Allen Chang
M. Knapp
J. Labelle
J. Swoboda
R. Volz
P. Erickson
19
2
0
24 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm, ViT
82
62
0
23 Oct 2022
Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and Future
Guo-Jun Qi
M. Shah
SSL
78
8
0
23 Oct 2022
Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation
Atsuyuki Miyai
Qing Yu
Daiki Ikami
Go Irie
Kiyoharu Aizawa
SSL
91
5
0
23 Oct 2022
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng
Jiaxin Shi
Hao Zhang
Peng Cui
Cewu Lu
Jun Zhu
109
14
0
23 Oct 2022
Spectrum-BERT: Pre-training of Deep Bidirectional Transformers for Spectral Classification of Chinese Liquors
Yansong Wang
Yundong Sun
Yan-Jiao Fu
Dongjie Zhu
Zhaoshuo Tian
46
6
0
22 Oct 2022
Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets
Xiangyu Chen
Qinghao Hu
Kaidong Li
Cuncong Zhong
Guanghui Wang
ViT
81
13
0
22 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
117
33
0
21 Oct 2022
i-MAE: Are Latent Representations in Masked Autoencoders Linearly Separable?
Kevin Zhang
Zhiqiang Shen
62
8
0
20 Oct 2022
Self-Supervised Learning via Maximum Entropy Coding
Xin Liu
Zhongdao Wang
Yali Li
Shengjin Wang
SSL
134
43
0
20 Oct 2022