Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
    ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
Masked Autoencoders are Efficient Continual Federated Learners
Subarnaduti Paul
Lars-Joel Frey
Roshni Kamath
Kristian Kersting
Martin Mundt
CLL, FedML
80
2
0
06 Jun 2023
Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers
Hanxi Li
Jing Wu
Lin Yuanbo Wu
Hao Chen
Deyin Liu
Mingwen Wang
Peng Wang
ViT
106
4
0
06 Jun 2023
Quantifying the Variability Collapse of Neural Networks
Jing-Xue Xu
Haoxiong Liu
94
6
0
06 Jun 2023
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data
Chongyi Zheng
Benjamin Eysenbach
Homer Walke
Patrick Yin
Kuan Fang
Ruslan Salakhutdinov
Sergey Levine
OffRL, SSL
92
6
0
06 Jun 2023
Discovering Novel Biological Traits From Images Using Phylogeny-Guided Neural Networks
Mohannad Elhamod
Mridul Khurana
Harish Babu Manogaran
Josef C. Uyeda
M. Balk
...
Wei-Lun Chao
Chuck Stewart
Daniel Rubenstein
T. Berger-Wolf
Anuj Karpatne
58
7
0
05 Jun 2023
Asymmetric Patch Sampling for Contrastive Learning
Chen Shen
Jianzhong Chen
Shu Wang
Hulin Kuang
Jin Liu
Jianxin Wang
SSL
116
5
0
05 Jun 2023
Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization
Yimeng Chen
Tianyang Hu
Fengwei Zhou
Zhenguo Li
Zhiming Ma
77
12
0
05 Jun 2023
Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence
Anna Dawid
Yann LeCun
DRL
108
31
0
05 Jun 2023
Multi-View Representation is What You Need for Point-Cloud Pre-Training
Siming Yan
Chen Song
Youkang Kong
Qi-Xing Huang
3DPC
134
2
0
05 Jun 2023
Systematic Visual Reasoning through Object-Centric Relational Abstraction
Taylor Webb
S. S. Mondal
Jonathan D. Cohen
OCL
119
26
0
04 Jun 2023
Data Quality in Imitation Learning
Suneel Belkhale
Yuchen Cui
Dorsa Sadigh
92
52
0
04 Jun 2023
Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
Yunhe Gao
Zhuowei Li
Di Liu
Mu Zhou
Shaoting Zhang
Dimitris N. Metaxas
MedIm
98
13
0
04 Jun 2023
rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement
Xin Liu
Yuting Zhang
Zitong Yu
Hao Lu
Huanjing Yue
Jingyu Yang
87
31
0
04 Jun 2023
SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model
Dingyuan Zhang
Dingkang Liang
Hongcheng Yang
Zhikang Zou
Xiaoqing Ye
Yanfeng Guo
Xiang Bai
VLM
110
46
0
04 Jun 2023
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
Jianghui Wang
Yuxuan Wang
Dongyan Zhao
Zilong Zheng
100
1
0
04 Jun 2023
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
Chenyang Lu
Daan de Geus
Gijs Dubbelman
ViT
132
20
0
03 Jun 2023
Towards Black-box Adversarial Example Detection: A Data Reconstruction-based Method
Yifei Gao
Zhi Lin
Yunfan Yang
Jitao Sang
AAML
96
4
0
03 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
104
0
0
02 Jun 2023
DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Xiuye Gu
Huayu Chen
Jonathan Huang
Abdullah M. Rashwan
Boxin Wang
...
Golnaz Ghiasi
Weicheng Kuo
Huizhong Chen
Liang-Chieh Chen
David A. Ross
ISeg
113
27
0
02 Jun 2023
Denoising Diffusion Semantic Segmentation with Mask Prior Modeling
Zeqiang Lai
Yuchen Duan
Jifeng Dai
Ziheng Li
Ying Fu
Hongsheng Li
Yu Qiao
Wen Wang
DiffM
102
20
0
02 Jun 2023
Unifying (Machine) Vision via Counterfactual World Modeling
Daniel M. Bear
Kevin T. Feigelis
Honglin Chen
Wanhee Lee
R. Venkatesh
Klemen Kotar
Alex Durango
Daniel L. K. Yamins
VGen
65
14
0
02 Jun 2023
Towards In-context Scene Understanding
Ivana Balazevic
David Steiner
Nikhil Parthasarathy
Relja Arandjelović
Olivier J. Hénaff
101
31
0
02 Jun 2023
Evaluating The Robustness of Self-Supervised Representations to Background/Foreground Removal
Xavier F. Cadet
Ranya Aloufi
A. Miranville
S. Ahmadi-Abhari
Hamed Haddadi
82
0
0
02 Jun 2023
Masked Autoencoder for Unsupervised Video Summarization
Minho Shim
Taeoh Kim
Jinhyung Kim
Dongyoon Wee
56
1
0
02 Jun 2023
Towards Robust GAN-generated Image Detection: a Multi-view Completion Representation
Chi Liu
Tianqing Zhu
Sheng Shen
Wanlei Zhou
AAML
70
8
0
02 Jun 2023
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu
Sam Buchanan
Druv Pai
Tianzhe Chu
Ziyang Wu
Shengbang Tong
B. Haeffele
Yi Ma
ViT
110
87
0
01 Jun 2023
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Chaitanya K. Ryali
Yuan-Ting Hu
Daniel Bolya
Chen Wei
Haoqi Fan
...
Omid Poursaeed
Judy Hoffman
Jitendra Malik
Yanghao Li
Christoph Feichtenhofer
3DH
136
189
0
01 Jun 2023
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Yonglong Tian
Lijie Fan
Phillip Isola
Huiwen Chang
Dilip Krishnan
VLM, DiffM
156
153
0
01 Jun 2023
MOSAIC: Masked Optimisation with Selective Attention for Image Reconstruction
Pamuditha Somarathne
Tharindu Wickremasinghe
Amashi Niwarthana
A. Thieshanthan
Chamira U. S. Edussooriya
D. Wadduwage
144
0
0
01 Jun 2023
DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection
Rui Shao
Tianxing Wu
Liqiang Nie
Ziwei Liu
80
14
0
01 Jun 2023
Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che
Zhaokun Zhou
Zhengyu Ma
Wei Fang
Yanqing Chen
Shuaijie Shen
Liuliang Yuan
Yonghong Tian
113
8
0
01 Jun 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai
Bing Liu
Andrej Risteski
Zico Kolter
Pradeep Ravikumar
SSL
123
10
0
01 Jun 2023
Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners
Sarthak Yadav
Sergios Theodoridis
Lars Kai Hansen
Zheng-Hua Tan
102
9
0
01 Jun 2023
A Novel Driver Distraction Behavior Detection Method Based on Self-supervised Learning with Masked Image Modeling
Yingzhi Zhang
Taiguo Li
Chong Li
Xinghong Zhou
154
11
0
01 Jun 2023
On Masked Pre-training and the Marginal Likelihood
Pablo Moreno-Muñoz
Pol G. Recasens
Søren Hauberg
SSL
61
6
0
01 Jun 2023
Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
Baohao Liao
Shaomu Tan
Christof Monz
KELM
105
30
0
01 Jun 2023
Exploring Open-Vocabulary Semantic Segmentation without Human Labels
Jun Chen
Deyao Zhu
Guocheng Qian
Guohao Li
Zhicheng Yan
Chenchen Zhu
Fanyi Xiao
Mohamed Elhoseiny
Sean Culatana
VLM
92
11
0
01 Jun 2023
Affinity-based Attention in Self-supervised Transformers Predicts Dynamics of Object Grouping in Humans
Hossein Adeli
Seoyoung Ahn
N. Kriegeskorte
G. Zelinsky
ViT
72
5
0
01 Jun 2023
Humans in 4D: Reconstructing and Tracking Humans with Transformers
Shubham Goel
Georgios Pavlakos
Jathushan Rajasegaran
Angjoo Kanazawa
Jitendra Malik
3DH
103
192
0
31 May 2023
Improving CLIP Training with Language Rewrites
Lijie Fan
Dilip Krishnan
Phillip Isola
Dina Katabi
Yonglong Tian
BDL, VLM, CLIP
128
179
0
31 May 2023
Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
Zhongwei Wan
Che Liu
Mi Zhang
Jie Fu
Benyou Wang
Sibo Cheng
Lei Ma
César Quilodrán-Casas
Rossella Arcucci
115
77
0
31 May 2023
There is more to graphs than meets the eye: Learning universal features with self-supervision
L. Das
Sai Munikoti
M. Halappanavar
SSL, OOD
81
1
0
31 May 2023
Unsupervised Anomaly Detection in Medical Images Using Masked Diffusion Model
H. Iqbal
Umar Khalid
Jing Hua
Chong Chen
DiffM, MedIm
94
29
0
31 May 2023
A Survey of Label-Efficient Deep Learning for 3D Point Clouds
Aoran Xiao
Xiaoqin Zhang
Ling Shao
Shijian Lu
3DPC
117
24
0
31 May 2023
Augmentation-aware Self-supervised Learning with Conditioned Projector
Marcin Przewięźlikowski
Mateusz Pyla
Bartosz Zieliński
Bartłomiej Twardowski
Jacek Tabor
Marek Śmieja
SSL
101
3
0
31 May 2023
Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast
Guo Fan
Zekun Qi
Wenkai Shi
Kaisheng Ma
3DPC, SSL
117
10
0
31 May 2023
Contextual Vision Transformers for Robust Representation Learning
Yu Bao
Theofanis Karaletsos
ViT
47
14
0
30 May 2023
Ambient Diffusion: Learning Clean Distributions from Corrupted Data
Giannis Daras
Kulin Shah
Y. Dagan
Aravind Gollakota
A. Dimakis
Adam R. Klivans
DiffM
132
76
0
30 May 2023
UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving
Chen Min
Liang Xiao
Dawei Zhao
Yiming Nie
Bin Dai
116
20
0
30 May 2023
Graph Neural Processes for Spatio-Temporal Extrapolation
Junfeng Hu
Yuxuan Liang
Zhencheng Fan
Hongyang Chen
Yu Zheng
Roger Zimmermann
BDL
78
12
0
30 May 2023