Masked Autoencoders Are Scalable Vision Learners
arXiv:2111.06377 (v3, latest)

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,778 papers shown
A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification
Markus Marks
Manuel Knott
Neehar Kondapaneni
Elijah Cole
T. Defraeye
Fernando Pérez-Cruz
Pietro Perona
SSL
132
5
0
16 Jul 2024
This Probably Looks Exactly Like That: An Invertible Prototypical Network
Zachariah Carmichael
Timothy Redgrave
Daniel Gonzalez Cedre
Walter J. Scheirer
BDL
122
2
0
16 Jul 2024
Encapsulating Knowledge in One Prompt
Qi Li
Runpeng Yu
Xinchao Wang
VLM, KELM
76
3
0
16 Jul 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
96
6
0
16 Jul 2024
Global atmospheric data assimilation with multi-modal masked autoencoders
T. Vandal
Kate Duffy
Daniel J. McDuff
Yoni Nachmany
Chris Hartshorn
AI4Cl, AI4CE
50
4
0
16 Jul 2024
Mask-guided cross-image attention for zero-shot in-silico histopathologic image generation with a diffusion model
Dominik Winter
Nicolas Triltsch
Marco Rosati
Anatoliy Shumilov
Ziya Kokaragac
...
T. Padel
L. S. Monasor
Ross Hill
Markus Schick
N. Brieu
DiffM, MedIm
132
3
0
16 Jul 2024
Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification
Naif Alkhunaizi
Faris Almalik
Rouqaiah Al-Refai
Muzammal Naseer
Karthik Nandakumar
MedIm
109
2
0
16 Jul 2024
Cross-Modal Augmentation for Few-Shot Multimodal Fake News Detection
Ye Jiang
Taihang Wang
Xiaoman Xu
Yimin Wang
Xingyi Song
Diana Maynard
94
2
0
16 Jul 2024
AU-vMAE: Knowledge-Guide Action Units Detection via Video Masked Autoencoder
Qiaoqiao Jin
Rui Shi
Yishun Dou
Bingbing Ni
CVBM
87
0
0
16 Jul 2024
EndoFinder: Online Image Retrieval for Explainable Colorectal Polyp Diagnosis
Ruijie Yang
Yan Zhu
Peiyao Fu
Yizhe Zhang
Zhihua Wang
Quanlin Li
Pinghong Zhou
Xian Yang
Shuo Wang
MedIm
57
0
0
16 Jul 2024
TCFormer: Visual Recognition via Token Clustering Transformer
Wang Zeng
Sheng Jin
Lumin Xu
Wentao Liu
Chao Qian
Wanli Ouyang
Ping Luo
Xiaogang Wang
77
5
0
16 Jul 2024
COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
Liu He
Daniel G. Aliaga
AI4TS
94
9
0
16 Jul 2024
FoodMem: Near Real-time and Precise Food Video Segmentation
Ahmad AlMughrabi
Adrián Galán
Ricardo Marques
Petia Radeva
VOS
99
2
0
16 Jul 2024
Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing
Ioannis Maniadis Metaxas
Georgios Tzimiropoulos
Ioannis Patras
SSL
109
0
0
15 Jul 2024
R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection
Zheyuan Zhou
Le Wang
N. Fang
Zili Wang
Le-miao Qiu
Shuyou Zhang
85
15
0
15 Jul 2024
Joint-Embedding Predictive Architecture for Self-Supervised Learning of Mask Classification Architecture
Donghee Kim
Sungduk Cho
Hyeonwoo Cho
Chanmin Park
Jinyoung Kim
Won Hwa Kim
96
0
0
15 Jul 2024
Learning Natural Consistency Representation for Face Forgery Video Detection
Daichi Zhang
Zihao Xiao
Shikun Li
Fanzhao Lin
Jianmin Li
Shiming Ge
CVBM
103
13
0
15 Jul 2024
Omni-Dimensional Frequency Learner for General Time Series Analysis
Xianing Chen
Hanting Chen
Hailin Hu
AI4TS
75
0
0
15 Jul 2024
Representation Learning and Identity Adversarial Training for Facial Behavior Understanding
Mang Ning
A. A. Salah
Itir Onal Ertugrul
CVBM
178
5
0
15 Jul 2024
When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark
Junhong Lin
Xiaojie Guo
Shuaicheng Zhang
Dawei Zhou
Yada Zhu
71
1
0
15 Jul 2024
Pre-training with Fractional Denoising to Enhance Molecular Property Prediction
Yuyan Ni
Shikun Feng
Xin Hong
Yuancheng Sun
Wei-Ying Ma
Zhiming Ma
Qiwei Ye
Yanyan Lan
AI4CE
81
18
0
14 Jul 2024
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation
Li Li
Hubert P. H. Shum
T. Breckon
3DPC
118
11
0
14 Jul 2024
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xin-Jian Wu
Rui-Song Zhang
Jie Qin
Shijie Ma
Cheng-Lin Liu
VLM
93
1
0
14 Jul 2024
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang
Wang Zeng
Sheng Jin
Chao Qian
Ping Luo
Wentao Liu
80
6
0
14 Jul 2024
Pre-training Point Cloud Compact Model with Partial-aware Reconstruction
Yaohua Zha
Yanzi Wang
Tao Dai
Shu-Tao Xia
101
0
0
12 Jul 2024
On the Role of Discrete Tokenization in Visual Representation Learning
Tianqi Du
Yifei Wang
Yisen Wang
106
7
0
12 Jul 2024
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak
Byeongju Woo
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
134
5
0
12 Jul 2024
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Honghao Chen
Yurong Zhang
Xiaokun Feng
Xiangxiang Chu
Kaiqi Huang
AAML
83
6
0
12 Jul 2024
Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT
Jie Zheng
Ru Wen
Haiqin Hu
Lina Wei
Kui Su
Wei Chen
Chen Liu
Jun Wang
96
1
0
12 Jul 2024
Adaptive Parametric Activation
Konstantinos Panagiotis Alexandridis
Jiankang Deng
Anh Nguyen
Shan Luo
84
5
0
11 Jul 2024
Paving the way toward foundation models for irregular and unaligned Satellite Image Time Series
Iris Dumeur
Silvia Valero
Jordi Inglada
113
3
0
11 Jul 2024
Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization
Jinlong Li
Zequn Jie
Elisa Ricci
Lin Ma
N. Sebe
VLM
101
1
0
11 Jul 2024
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
Siyi Du
Shaoming Zheng
Yinsong Wang
Wenjia Bai
D. O’Regan
Chen Qin
LMTD
97
5
0
10 Jul 2024
Disentangling Masked Autoencoders for Unsupervised Domain Generalization
An Zhang
Han Wang
Xiang Wang
Tat-Seng Chua
100
0
0
10 Jul 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao
Bo Wan
Xu Jia
Yunzhi Zhuge
Ying Zhang
Huchuan Lu
Long Chen
VLM
93
4
0
10 Jul 2024
Pan-cancer Histopathology WSI Pre-training with Position-aware Masked Autoencoder
Kun-Hsuan Wu
Zhiguo Jiang
Kunming Tang
Jun Shi
Fengying Xie
Wei Wang
Haibo Wu
Yushan Zheng
43
1
0
10 Jul 2024
Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining
Tianfang Sun
Zhizhong Zhang
Xin Tan
Yanyun Qu
Yuan Xie
109
0
0
10 Jul 2024
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang
Junliang Guo
Tianyu He
Li Zhao
Linli Xu
Jiang Bian
120
4
0
10 Jul 2024
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li
Deshui Miao
Zhenyu He
Yansen Wang
Huchuan Lu
Ming-Hsuan Yang
VOS
169
4
0
10 Jul 2024
Dataset Quantization with Active Learning based Adaptive Sampling
Zhenghao Zhao
Yuzhang Shang
Junyi Wu
Yan Yan
DD
98
5
0
09 Jul 2024
Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach
Taolin Zhang
Jiawang Bai
Zhihe Lu
Dongze Lian
Genping Wang
Xinchao Wang
Shu-Tao Xia
95
5
0
09 Jul 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
96
1
0
09 Jul 2024
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang
Yifei Huang
Ruicong Liu
Yoichi Sato
95
8
0
09 Jul 2024
D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
Tajamul Ashraf
K. Rangarajan
Mohit Gambhir
Richa Gabha
Chetan Arora
MedIm
116
2
0
09 Jul 2024
Asymmetric Mask Scheme for Self-Supervised Real Image Denoising
Xiangyu Liao
Tianheng Zheng
Jiayu Zhong
Pingping Zhang
Chao Ren
115
4
0
09 Jul 2024
A Clinical Benchmark of Public Self-Supervised Pathology Foundation Models
Gabriele Campanella
Shengjia Chen
Ruchika Verma
Jennifer Zeng
A. Stock
...
Kuan-lin Huang
Ricky Kwan
Jane Houldsworth
Adam J. Schoenfeld
Chad M. Vanderbilt
AI4MH, OOD, LM&MA
89
23
0
09 Jul 2024
Reprogramming Distillation for Medical Foundation Models
Yuhang Zhou
Siyuan Du
Haolin Li
Jiangchao Yao
Ya Zhang
Yanfeng Wang
82
2
0
09 Jul 2024
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Yuheng Li
Tianyu Luan
Yizhou Wu
Shaoyan Pan
Yenho Chen
Xiaofeng Yang
83
6
0
09 Jul 2024
Noise-Free Explanation for Driving Action Prediction
Hongbo Zhu
Theodor Wulff
R. S. Maharjan
Jinpei Han
Angelo Cangelosi
AAML, FAtt
64
0
0
08 Jul 2024
Unsupervised Fault Detection using SAM with a Moving Window Approach
Ahmed Maged
Herman Shen
51
0
0
08 Jul 2024