Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
Title
RMP: A Random Mask Pretrain Framework for Motion Prediction
Yi Yang
Qingwen Zhang
Thomas Gilles
Nazre Batool
John Folkesson
113
5
0
16 Sep 2023
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
Nina Shvetsova
Anna Kukleva
Bernt Schiele
Hilde Kuehne
DiffM
79
4
0
16 Sep 2023
AV-MaskEnhancer: Enhancing Video Representations through Audio-Visual Masked Autoencoder
Xingjian Diao
Ming Cheng
Shitong Cheng
VGen
92
10
0
15 Sep 2023
Compositional Foundation Models for Hierarchical Planning
Anurag Ajay
Seung-Jun Han
Yilun Du
Shuang Li
Abhi Gupta
Tommi Jaakkola
Josh Tenenbaum
L. Kaelbling
Akash Srivastava
Pulkit Agrawal
LRM
131
71
0
15 Sep 2023
IHT-Inspired Neural Network for Single-Snapshot DOA Estimation with Sparse Linear Arrays
Yunqiao Hu
Shunqiao Sun
32
4
0
15 Sep 2023
Understanding the limitations of self-supervised learning for tabular anomaly detection
Kimberly T. Mai
Toby O. Davies
Lewis D. Griffin
SSL
78
0
0
15 Sep 2023
A Generative Framework for Self-Supervised Facial Representation Learning
Ruian He
Zhen Xing
Weimin Tan
Bo Yan
DiffM
74
0
0
15 Sep 2023
Leveraging the Power of Data Augmentation for Transformer-based Tracking
Jie Zhao
Johan Edstedt
Michael Felsberg
D. Wang
Huchuan Lu
ViT
105
4
0
15 Sep 2023
BROW: Better featuRes fOr Whole slide image based on self-distillation
Yuan Wu
Shaojie Li
Zhiqiang Du
Wentao Zhu
55
5
0
15 Sep 2023
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
Hongchen Wang
Andy Guan Hong Chen
Xiaoqi Li
Mingdong Wu
Hao Dong
69
16
0
15 Sep 2023
A Novel Local-Global Feature Fusion Framework for Body-weight Exercise Recognition with Pressure Mapping Sensors
Davinder Pal Singh
L. Ray
Bo Zhou
Sungho Suh
P. Lukowicz
67
4
0
14 Sep 2023
Virchow: A Million-Slide Digital Pathology Foundation Model
Eugene Vorontsov
Alican Bozkurt
Adam Casson
George Shaikovski
Michal Zelechowski
...
Razik Yousfi
Christopher Kanan
David Klimstra
B. Rothrock
Thomas J. Fuchs
MedIm
145
92
0
14 Sep 2023
Co-Salient Object Detection with Semantic-Level Consensus Extraction and Dispersion
Peiran Xu
Yadong Mu
71
7
0
14 Sep 2023
Efficiently Robustify Pre-trained Models
Nishant Jain
Iit Roorkee
Harkirat Singh Behl
Vibhav Vineet
OOD
VLM
47
3
0
14 Sep 2023
Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for Histopathology Images
Zhiyun Song
Penghui Du
Junpeng Yan
Keqin Li
Jianzhong Shou
Maode Lai
Yubo Fan
Yan Xu
97
8
0
14 Sep 2023
EnCodecMAE: Leveraging neural codecs for universal audio representation learning
L. Pepino
Pablo Riera
Luciana Ferrer
80
5
0
14 Sep 2023
Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology
Nirhoshan Sivaroopan
Chamuditha Jayanga
Chalani Ekanayake
Hasindri Watawana
Jathurshan Pradeepkumar
Mithunjha Anandakumar
Ranga Rodrigo
Chamira U. S. Edussooriya
D. Wadduwage
MedIm
UQCV
82
0
0
13 Sep 2023
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
Yiran Qin
Chaoqun Wang
Zijian Kang
Ningning Ma
Zhen Li
Ruimao Zhang
3DPC
100
11
0
13 Sep 2023
Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning
Sanghyeon Kim
Hyunmo Yang
Younghyun Kim
Youngjoon Hong
Eunbyung Park
AI4CE
78
18
0
13 Sep 2023
Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning
S. Kapse
Srijan Das
Jingwei Zhang
Rajarsi R. Gupta
Joel H. Saltz
Dimitris Samaras
Prateek Prasanna
75
9
0
12 Sep 2023
360° from a Single Camera: A Few-Shot Approach for LiDAR Segmentation
Laurenz Reichardt
Nikolas Ebert
Oliver Wasenmüller
3DPC
128
11
0
12 Sep 2023
A 3M-Hybrid Model for the Restoration of Unique Giant Murals: A Case Study on the Murals of Yongle Palace
Jing Yang
Nur Intan Raihana Ruhaiyem
Chichun Zhou
71
1
0
12 Sep 2023
Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals
Ran Liu
Ellen L. Zippi
Hadi Pouransari
Chris Sandino
Jingping Nie
Hanlin Goh
Erdrin Azemi
Ali Moin
104
12
0
12 Sep 2023
Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning
Weijian Huang
Cheng Li
Hao Yang
Jiarun Liu
Shanshan Wang
MedIm
90
26
0
12 Sep 2023
Exploration and Comparison of Deep Learning Architectures to Predict Brain Response to Realistic Pictures
Riccardo Chimisso
Sathya Buršić
Paolo Marocco
Giuseppe Vizzari
Dimitri Ognibene
3DV
58
1
0
11 Sep 2023
Toward a Deeper Understanding: RetNet Viewed through Convolution
Chenghao Li
Chaoning Zhang
ViT
75
7
0
11 Sep 2023
Decoupling Common and Unique Representations for Multimodal Self-supervised Learning
Yi Wang
C. Albrecht
Nassim Ait Ali Braham
Chenying Liu
Zhitong Xiong
Xiaoxiang Zhu
SSL
97
19
0
11 Sep 2023
Examining the Effect of Pre-training on Time Series Classification
Jiashu Pu
Shiwei Zhao
Ling Cheng
Yongzhu Chang
Runze Wu
Tangjie Lv
Rongsheng Zhang
AI4TS
115
0
0
11 Sep 2023
HAT: Hybrid Attention Transformer for Image Restoration
Xiangyu Chen
Xintao Wang
Wenlong Zhang
Xiangtao Kong
Yu Qiao
Jiantao Zhou
Chao Dong
101
53
0
11 Sep 2023
DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation
Z. Zang
Hao Luo
Kaidi Wang
Panpan Zhang
F. Wang
Stan Z. Li
Yang You
91
5
0
10 Sep 2023
Self-Supervised Transformer with Domain Adaptive Reconstruction for General Face Forgery Video Detection
Daichi Zhang
Zihao Xiao
Jianmin Li
Shiming Ge
CVBM
ViT
61
2
0
09 Sep 2023
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
Yang Jin
Kun Xu
Kun Xu
Liwei Chen
Chao Liao
...
Xiaoqiang Lei
Di Zhang
Wenwu Ou
Kun Gai
Yadong Mu
MLLM
VLM
82
50
0
09 Sep 2023
ConvFormer: Plug-and-Play CNN-Style Transformers for Improving Medical Image Segmentation
Xian Lin
Zengqiang Yan
Xianbo Deng
Chuansheng Zheng
Li Yu
ViT
MedIm
112
31
0
09 Sep 2023
Video and Synthetic MRI Pre-training of 3D Vision Architectures for Neuroimage Analysis
Nikhil J. Dhinagar
Amit Singh
Saket Ozarkar
Ketaki Buwa
Sophia I Thomopoulos
...
Corey McMillan
Chih-Chien Tsai
Jiun-Jie Wang
Yih-Ru Wu
Paul M. Thompson
MedIm
62
2
0
09 Sep 2023
Motif-aware Attribute Masking for Molecular Graph Pre-training
Eric Inae
Gang Liu
Meng Jiang
AI4CE
130
15
0
08 Sep 2023
AMLP: Adaptive Masking Lesion Patches for Self-supervised Medical Image Segmentation
Xiang-Fei Wang
Ruizhi Wang
Jie Zhou
Thomas Lukasiewicz
Zhenghua Xu
102
0
0
08 Sep 2023
Adapting Self-Supervised Representations to Multi-Domain Setups
Neha Kalibhat
Sam Sharpe
Jeremy Goodsitt
Bayan Bruss
Soheil Feizi
71
0
0
07 Sep 2023
CDFSL-V: Cross-Domain Few-Shot Learning for Videos
Sarinda Samarasinghe
Mamshad Nayeem Rizve
Navid Kardan
M. Shah
89
11
0
07 Sep 2023
MS-UNet-v2: Adaptive Denoising Method and Training Strategy for Medical Image Segmentation with Small Training Data
Haoyuan Chen
Yufei Han
Pin Xu
Yanyi Li
Kuan Li
Jianping Yin
113
0
0
07 Sep 2023
DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tong Wang
Zhaoxiang Zhang
80
21
0
07 Sep 2023
Toward High Quality Facial Representation Learning
Yue Wang
Jinlong Peng
Jiangning Zhang
Ran Yi
Lu Liu
Yabiao Wang
Chengjie Wang
CVBM
SSL
96
7
0
07 Sep 2023
Perceptual Quality Assessment of 360° Images Based on Generative Scanpath Representation
Xiangjie Sui
Hanwei Zhu
Xuelin Liu
Yuming Fang
Shiqi Wang
Zhou Wang
82
6
0
07 Sep 2023
Self-Supervised Masked Digital Elevation Models Encoding for Low-Resource Downstream Tasks
Priyam Mazumdar
Aiman Soliman
Volodymyr V. Kindratenko
Luigi Marini
Kenton McHenry
69
0
0
06 Sep 2023
ViewMix: Augmentation for Robust Representation in Self-Supervised Learning
A. Das
Agnibh Dasgupta
SSL
59
0
0
06 Sep 2023
Towards Efficient Training with Negative Samples in Visual Tracking
Qingmao Wei
Bi Zeng
Guotian Zeng
AAML
91
1
0
06 Sep 2023
Learning Vehicle Dynamics from Cropped Image Patches for Robot Navigation in Unpaved Outdoor Terrains
Jeong Hyun Lee
Jinhyeok Choi
Simo Ryu
Hyunsik Oh
Suyoung Choi
Jemin Hwangbo
41
2
0
06 Sep 2023
DMKD: Improving Feature-based Knowledge Distillation for Object Detection Via Dual Masking Augmentation
Guangqi Yang
Yin Tang
Zhijian Wu
Jun Yu Li
Jianhua Xu
Xili Wan
77
4
0
06 Sep 2023
Gene-induced Multimodal Pre-training for Image-omic Classification
Ting Jin
Xingran Xie
Renjie Wan
Qingli Li
Yan Wang
AI4CE
84
11
0
06 Sep 2023
Efficient Training for Visual Tracking with Deformable Transformer
Qingmao Wei
Guotian Zeng
Bi Zeng
ViT
145
4
0
06 Sep 2023
Representation Learning for Sequential Volumetric Design Tasks
Md Ferdous Alam
Yi Wang
Linh Tran
Chin-Yi Cheng
Jieliang Luo
3DV
95
2
0
05 Sep 2023