ResearchTrend.AI
Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,779 papers shown
You Can Mask More For Extremely Low-Bitrate Image Compression
Anqi Li
Feng Li
Jiaxin Han
H. Bai
Runmin Cong
Chunjie Zhang
Ming Wang
Weisi Lin
Yao-Min Zhao
60
2
0
27 Jun 2023
FedET: A Communication-Efficient Federated Class-Incremental Learning Framework Based on Enhanced Transformer
Chenghao Liu
Xiaoyang Qu
Jianzong Wang
Jing Xiao
CLL, FedML
85
26
0
27 Jun 2023
MAE-GEBD:Winning the CVPR'2023 LOVEU-GEBD Challenge
Yuanxi Sun
Ruifei He
Youzeng Li
Zuwei Huang
Feng Hu
Xu Cheng
Jie Tang
42
1
0
27 Jun 2023
MIMIC: Masked Image Modeling with Image Correspondences
Kalyani Marathe
Mahtab Bigverdi
Nishat Khan
Tuhin Kundu
Patrick Howe
Sharan Ranjit S
Anand Bhattad
Aniruddha Kembhavi
Linda G. Shapiro
Ranjay Krishna
67
0
0
27 Jun 2023
Learning to Modulate pre-trained Models in RL
Thomas Schmied
M. Hofmarcher
Fabian Paischer
Razvan Pascanu
Sepp Hochreiter
CLL, OffRL
105
18
0
26 Jun 2023
ViNT: A Foundation Model for Visual Navigation
Dhruv Shah
A. Sridhar
Nitish Dashora
Kyle Stachowicz
Kevin Black
Noriaki Hirose
Sergey Levine
LM&Ro
80
147
0
26 Jun 2023
Learning with Difference Attention for Visually Grounded Self-supervised Representations
Aishwarya Agarwal
Srikrishna Karanam
Balaji Vasan Srinivasan
88
1
0
26 Jun 2023
ParameterNet: Parameters Are All You Need
Kai Han
Yunhe Wang
Jianyuan Guo
Enhua Wu
VLM, AI4CE
75
31
0
26 Jun 2023
Masked conditional variational autoencoders for chromosome straightening
Jingxiong Li
S. Zheng
Zhongyi Shui
Shichuan Zhang
Lin Yang
...
Honglin Li
Y. Ye
P. V. van Ooijen
Kang Li
Lin Yang
CML
49
3
0
25 Jun 2023
Exploring Data Redundancy in Real-world Image Classification through Data Selection
Zhenyu Tang
Shaoting Zhang
Xiaosong Wang
44
3
0
25 Jun 2023
Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets
Anirudhan Badrinath
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
102
19
0
24 Jun 2023
How to Efficiently Adapt Large Segmentation Model(SAM) to Medical Images
Xinrong Hu
Xiaowei Xu
Yi Shi
VLM, MedIm
53
64
0
23 Jun 2023
ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration
Jiaqi Ma
Tianheng Cheng
Guoli Wang
Qian Zhang
Xinggang Wang
Lefei Zhang
DiffM, VLM
81
48
0
23 Jun 2023
Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning
Shaofeng Zhang
Feng Zhu
Rui Zhao
Junchi Yan
89
18
0
23 Jun 2023
Ladder Fine-tuning approach for SAM integrating complementary network
Shurong Chai
R. Jain
Shiyu Teng
Jiaqing Liu
Yinhao Li
T. Tateyama
Yen-wei Chen
MedIm
81
33
0
22 Jun 2023
Inter-Instance Similarity Modeling for Contrastive Learning
Chen Shen
Dawei Liu
Hao Tang
Zhe Qu
Jianxin Wang
SSL
79
7
0
21 Jun 2023
Annotating Ambiguous Images: General Annotation Strategy for High-Quality Data with Real-World Biomedical Validation
Lars Schmarje
Vasco Grossmann
Claudius Zelenka
Johannes Brunger
Reinhard Koch
82
1
0
21 Jun 2023
ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
Dezhi Peng
Chongyu Liu
Yuliang Liu
Lianwen Jin
DiffM
81
10
0
21 Jun 2023
Task-Robust Pre-Training for Worst-Case Downstream Adaptation
Jianghui Wang
Cheng Yang
Xingyu Xie
Cong Fang
Zhouchen Lin
OOD
84
0
0
21 Jun 2023
Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
Nicolae-Cătălin Ristea
Florinel-Alin Croitoru
Radu Tudor Ionescu
Marius Popescu
Fahad Shahbaz Khan
M. Shah
ViT
121
23
0
21 Jun 2023
Continual Learners are Incremental Model Generalizers
Jaehong Yoon
Sung Ju Hwang
Yu Cao
CLL
86
5
0
21 Jun 2023
End-to-End Augmentation Hyperparameter Tuning for Self-Supervised Anomaly Detection
Jaemin Yoo
Lingxiao Zhao
Leman Akoglu
109
4
0
21 Jun 2023
Multi-task Collaborative Pre-training and Individual-adaptive-tokens Fine-tuning: A Unified Framework for Brain Representation Learning
Ning Jiang
Gongshu Wang
Tianyi Yan
CLL
64
0
0
20 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffM, VLM
168
66
0
20 Jun 2023
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning
Cheng Tan
Siyuan Li
Zhangyang Gao
Wen-Cai Guan
Zedong Wang
Zicheng Liu
Lirong Wu
Stan Z. Li
AI4TS
109
63
0
20 Jun 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
Fan Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
170
229
0
19 Jun 2023
ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers
Ioannis Romanelis
Vlassis Fotis
Konstantinos Moustakas
Adrian Munteanu
ViT, 3DPC
123
4
0
19 Jun 2023
Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics
Kenta Oono
Nontawat Charoenphakdee
K. Bito
Zhengyan Gao
Yoshiaki Ota
...
Kohei Hayashi
Yuki Saito
Koki Tsuda
Hiroshi Maruyama
K. Hayashi
165
1
0
19 Jun 2023
RedMotion: Motion Prediction via Redundancy Reduction
Royden Wagner
Omer Sahin Tas
Marvin Klemp
Carlos Fernandez Lopez
Christoph Stiller
198
8
0
19 Jun 2023
Enhanced Masked Image Modeling for Analysis of Dental Panoramic Radiographs
A. Almalki
Longin Jan Latecki
MedIm
35
4
0
18 Jun 2023
FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue
Weihao Zeng
Keqing He
Yejie Wang
Chen Zeng
Jingang Wang
Yunsen Xian
Weiran Xu
58
1
0
17 Jun 2023
ALP: Action-Aware Embodied Learning for Perception
Xinran Liang
Anthony Han
Wilson Yan
Aditi Raghunathan
Pieter Abbeel
VLM
94
1
0
16 Jun 2023
Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects
Kexin Zhang
Qingsong Wen
Chaoli Zhang
Rongyao Cai
Ming Jin
...
James Y. Zhang
Yuxuan Liang
Guansong Pang
Dongjin Song
Shirui Pan
AI4TS
229
118
0
16 Jun 2023
Robot Learning with Sensorimotor Pre-training
Ilija Radosavovic
Baifeng Shi
Letian Fu
Ken Goldberg
Trevor Darrell
Jitendra Malik
SSL, LM&Ro
80
51
0
16 Jun 2023
Group Orthogonalization Regularization For Vision Models Adaptation and Robustness
Yoav Kurtz
Noga Bar
Raja Giryes
60
0
0
16 Jun 2023
MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification
Dequan Wang
Xiaosong Wang
Lilong Wang
Mengzhang Li
Q. Da
...
Qi Duan
Jie Zhao
Kang Li
Yu Qiao
Shaoting Zhang
VLM, MedIm
90
36
0
16 Jun 2023
Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks
Hongcheng Gao
Hao Zhang
Yinpeng Dong
Zhijie Deng
AAML
109
23
0
16 Jun 2023
Sample-Efficient Learning of Novel Visual Concepts
Sarthak Bhagat
Simon Stepputtis
Joseph Campbell
Katia Sycara
90
9
0
15 Jun 2023
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
You-Chen Liu
Lingdong Kong
Jun Cen
Runnan Chen
Wenwei Zhang
Liang Pan
Kai-xiang Chen
Ziwei Liu
80
91
0
15 Jun 2023
Rosetta Neurons: Mining the Common Units in a Model Zoo
Amil Dravid
Yossi Gandelsman
Alexei A. Efros
Assaf Shocher
83
31
0
15 Jun 2023
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data
Stephanie Fu
Netanel Y. Tamir
Shobhita Sundaram
Lucy Chai
Richard Y. Zhang
Tali Dekel
Phillip Isola
EGVM
100
123
0
15 Jun 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
145
316
0
15 Jun 2023
Learnable Weight Initialization for Volumetric Medical Image Segmentation
Shahina Kunhimon
Abdelrahman M. Shaker
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
104
1
0
15 Jun 2023
Fast Training of Diffusion Models with Masked Transformers
Hongkai Zheng
Weili Nie
Arash Vahdat
Anima Anandkumar
DiffM
121
73
0
15 Jun 2023
Radars for Autonomous Driving: A Review of Deep Learning Methods and Challenges
Arvind Srivastav
S. Mandal
124
34
0
15 Jun 2023
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa
Shehreen Azad
V. Sachidanand
Yunhao Ge
O. Mikšík
Yogesh S Rawat
Vibhav Vineet
OOD, VLM, AAML
73
9
0
15 Jun 2023
Text Promptable Surgical Instrument Segmentation with Vision-Language Models
Zijian Zhou
Oluwatosin O. Alabi
Meng Wei
Tom Vercauteren
Miaojing Shi
MedIm
101
25
0
15 Jun 2023
Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal
Zhentao Tan
Yue-bo Wu
Qiankun Liu
Qi Chu
Le Lu
Jieping Ye
Nenghai Yu
95
13
0
15 Jun 2023
DocumentNet: Bridging the Data Gap in Document Pre-Training
Lijun Yu
Jin Miao
Xiaoyu Sun
Jiayi Chen
Alexander G. Hauptmann
H. Dai
Wei Wei
42
3
0
15 Jun 2023
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Ziyang Ma
Zhisheng Zheng
Guanrou Yang
Yu Wang
Chuxu Zhang
Xie Chen
SSL
72
9
0
15 Jun 2023