Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis
Di Chang
Yufeng Yin
Zongjia Li
Minh Tran
M. Soleymani
CVBM
115
14
0
18 Aug 2023
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
Hao Feng
Wendi Wang
Jiajun Deng
Wen-gang Zhou
Li Li
Houqiang Li
79
13
0
17 Aug 2023
Cross-city Few-Shot Traffic Forecasting via Traffic Pattern Bank
Zhanyu Liu
Guanjie Zheng
Yanwei Yu
AI4TS
85
34
0
17 Aug 2023
Auxiliary Tasks Benefit 3D Skeleton-based Human Motion Prediction
Chenxin Xu
R. Tan
Yuhong Tan
Siheng Chen
Xinchao Wang
Yanfeng Wang
3DH
121
22
0
17 Aug 2023
Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification
Zhaopeng Dou
Zhongdao Wang
Yali Li
Shengjin Wang
85
15
0
17 Aug 2023
SRMAE: Masked Image Modeling for Scale-Invariant Deep Representations
Zhiming Wang
Lin Gu
Feng Lu
96
0
0
17 Aug 2023
Learning to In-paint: Domain Adaptive Shape Completion for 3D Organ Segmentation
Mingjin Chen
Yongkang He
Yongyi Lu
Zhi-Yi Yang
MedIm
78
1
0
17 Aug 2023
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu
Kenneth K. Y. Wong
Hengshuang Zhao
88
11
0
16 Aug 2023
Test-Time Poisoning Attacks Against Test-Time Adaptation Models
Tianshuo Cong
Xinlei He
Yun Shen
Yang Zhang
AAML, TTA
77
6
0
16 Aug 2023
Stable and Causal Inference for Discriminative Self-supervised Deep Visual Representations
Yuewei Yang
Hai Helen Li
Yiran Chen
CML, OOD
100
1
0
16 Aug 2023
Contrastive Learning for Lane Detection via Cross-Similarity
Ali Zoljodi
S. Abadijou
Mina Alibeigi
Masoud Daneshtalab
SSL
79
7
0
16 Aug 2023
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
Wei Gou
Ziyao Yi
Yan Xiang
Sha Li
Zibin Liu
Dehui Kong
Ke Xu
67
5
0
16 Aug 2023
Memory-and-Anticipation Transformer for Online Action Understanding
Jiahao Wang
Guo Chen
Yifei Huang
Liming Wang
Tong Lu
OffRL
137
43
0
15 Aug 2023
CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction
Yan Di
Chenyang Zhang
Pengyuan Wang
Guangyao Zhai
Ruida Zhang
Fabian Manhardt
Benjamin Busam
Xiangyang Ji
F. Tombari
DiffM
70
12
0
15 Aug 2023
A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
CVBM
75
7
0
14 Aug 2023
UniWorld: Autonomous Driving Pre-training via World Models
Chen Min
Dawei Zhao
Liang Xiao
Yiming Nie
Bin Dai
VGen
74
23
0
14 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
134
1
0
14 Aug 2023
Masked Motion Predictors are Strong 3D Action Representation Learners
Yunyao Mao
Jiajun Deng
Wen-gang Zhou
Yao Fang
Wanli Ouyang
Houqiang Li
3DPC
117
38
0
14 Aug 2023
PatchContrast: Self-Supervised Pre-training for 3D Object Detection
Oren Shrout
Ori Nitzan
Yizhak Ben-Shabat
A. Tal
3DPC
120
2
0
14 Aug 2023
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks
David Junhao Zhang
Mutian Xu
Chuhui Xue
Wenqing Zhang
Xiaoguang Han
Song Bai
Mike Zheng Shou
DiffM
133
6
0
13 Aug 2023
SimMatchV2: Semi-Supervised Learning with Graph Consistency
Mingkai Zheng
Shan You
Lang Huang
Chen Luo
Fei Wang
Chao Qian
Chang Xu
SSL
79
11
0
13 Aug 2023
Cyclic Test-Time Adaptation on Monocular Video for 3D Human Mesh Reconstruction
Hyeongjin Nam
Daniel Sungho Jung
Y. Oh
Kyoung Mu Lee
3DH
105
17
0
12 Aug 2023
Self-Supervised Pre-Training with Contrastive and Masked Autoencoder Methods for Dealing with Small Datasets in Deep Learning for Medical Imaging
Daniel Wolf
Tristan Payer
C. Lisson
C. Lisson
Meinrad Beer
Michael Götz
Timo Ropinski
109
18
0
12 Aug 2023
TongueSAM: An Universal Tongue Segmentation Model Based on SAM with Zero-Shot
Shan Cao
Qunsheng Ruan
Qingfeng Wu
VLM
87
12
0
12 Aug 2023
Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data
Cheol Jun Cho
Edward F. Chang
Gopala K. Anumanchipalli
80
7
0
12 Aug 2023
Towards Packaging Unit Detection for Automated Palletizing Tasks
Markus Völk
Kilian Kleeberger
Werner Kraus
Richard Bormann
43
0
0
11 Aug 2023
FoodSAM: Any Food Segmentation
Xing Lan
Jiayi Lyu
Han Jiang
Kunkun Dong
Zehai Niu
Yi Zhang
Jian Xue
VLM
94
28
0
11 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
80
10
0
10 Aug 2023
Masked Diffusion as Self-supervised Representation Learner
Zixuan Pan
Jianxu Chen
Yi Shi
MedIm, DiffM
77
10
0
10 Aug 2023
Spatio-Temporal Encoding of Brain Dynamics with Surface Masked Autoencoders
Simon Dahan
Logan Z. J. Williams
Yourong Guo
Daniel Rueckert
E. C. Robinson
73
0
0
10 Aug 2023
Adaptive Low Rank Adaptation of Segment Anything to Salient Object Detection
Rui-Qing Cui
Siyuan He
Shi Qiu
VLM
55
5
0
10 Aug 2023
Geometric Learning-Based Transformer Network for Estimation of Segmentation Errors
S. Sree
Mohammad Al Fahim
Keerthi Ram
M. Sivaprakasam
MedIm
26
0
0
09 Aug 2023
Robust Object Modeling for Visual Tracking
Y. Cai
Jie Liu
Jie Tang
Gangshan Wu
77
65
0
09 Aug 2023
MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Kaixin Cai
Pengzhen Ren
Yi Zhu
Hang Xu
Jian-zhuo Liu
Changlin Li
Guangrun Wang
Xiaodan Liang
VLM
81
15
0
09 Aug 2023
PETformer: Long-term Time Series Forecasting via Placeholder-enhanced Transformer
Shengsheng Lin
Weiwei Lin
Wentai Wu
Song Wang
Yongxiang Wang
AI4TS
81
21
0
09 Aug 2023
Self-supervised Learning of Rotation-invariant 3D Point Set Features using Transformer and its Self-distillation
T. Furuya
Zhoujie Chen
Ryutarou Ohbuchi
Zhenzhong Kuang
3DPC
62
2
0
09 Aug 2023
Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction
Izzeddin Teeti
Rongali Sai Bhargav
Vivek Singh
Andrew Bradley
Biplab Banerjee
Fabio Cuzzolin
71
3
0
08 Aug 2023
Improving Medical Image Classification in Noisy Labels Using Only Self-supervised Pretraining
Bidur Khanal
Binod Bhattarai
Bishesh Khanal
Cristian A. Linte
NoLa
94
8
0
08 Aug 2023
Unsupervised Camouflaged Object Segmentation as Domain Adaptation
Yi Zhang
Chengyi Wu
74
3
0
08 Aug 2023
LEFormer: A Hybrid CNN-Transformer Architecture for Accurate Lake Extraction from Remote Sensing Imagery
Ben Chen
Xuechao Zou
Yu-an Zhang
Jiayu Li
Kaihang Li
Junliang Xing
Pin Tao
ViT
54
13
0
08 Aug 2023
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment
Ziyu Zhu
Xiaojian Ma
Yixin Chen
Zhidong Deng
Siyuan Huang
Qing Li
LM&Ro
90
123
0
08 Aug 2023
Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval
Yi Bin
Haoxuan Li
Yahui Xu
Xing Xu
Yang Yang
Heng Tao Shen
VOS
73
20
0
08 Aug 2023
Exploring Transformers for Open-world Instance Segmentation
Jiannan Wu
Yi Jiang
B. Yan
Huchuan Lu
Zehuan Yuan
Ping Luo
ViT
76
6
0
08 Aug 2023
Prompted Contrast with Masked Motion Modeling: Towards Versatile 3D Action Representation Learning
Jiahang Zhang
Lilang Lin
Jiaying Liu
SSL
82
20
0
08 Aug 2023
SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition
Tianlin Li
Zong-Yao Wu
Yao Rong
Lin Zhu
Bowei Jiang
Jin Tang
Yonghong Tian
ViT
124
19
0
08 Aug 2023
On genuine invariance learning without weight-tying
A. Moskalev
A. Sepliarskaia
Erik J. Bekkers
A. Smeulders
CML, OOD
74
9
0
07 Aug 2023
AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation
Jay N. Paranjape
Nithin Gopalakrishnan Nair
S. Sikder
S. Vedula
Vishal M. Patel
MedIm
79
44
0
07 Aug 2023
Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission
Bingyan Xie
Yongpeng Wu
Yuxuan Shi
Derrick Wing Kwan Ng
Wenjun Zhang
101
13
0
07 Aug 2023
Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience
Emin Orhan
55
3
0
07 Aug 2023
Learning Concise and Descriptive Attributes for Visual Recognition
Andy Yan
Yu Wang
Yiwu Zhong
Chengyu Dong
Zexue He
Yujie Lu
William Wang
Jingbo Shang
Julian McAuley
VLM
119
64
0
07 Aug 2023