Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.09886
Cited By
SimMIM: A Simple Framework for Masked Image Modeling
18 November 2021
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SimMIM: A Simple Framework for Masked Image Modeling"
50 / 849 papers shown
Title
Self-Supervised Multimodal Fusion Transformer for Passive Activity Recognition
Armand K. Koupai
M. J. Bocus
Raúl Santos-Rodríguez
Robert Piechocki
Ryan McConville
ViT
30
9
0
15 Aug 2022
MILAN: Masked Image Pretraining on Language Assisted Representation
Zejiang Hou
Fei Sun
Yen-kuang Chen
Yuan Xie
S. Kung
ViT
31
68
0
11 Aug 2022
Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
Xiangwen Kong
Xiangyu Zhang
SSL
32
53
0
08 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
L. Zhang
36
242
0
08 Aug 2022
Frozen CLIP Models are Efficient Video Learners
Ziyi Lin
Shijie Geng
Renrui Zhang
Peng Gao
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Hongsheng Li
CLIP
VLM
16
200
0
06 Aug 2022
Masked Vision and Language Modeling for Multi-modal Representation Learning
Gukyeong Kwon
Zhaowei Cai
Avinash Ravichandran
Erhan Bas
Rahul Bhotika
Stefano Soatto
36
67
0
03 Aug 2022
SdAE: Self-distillated Masked Autoencoder
Yabo Chen
Yuchen Liu
Dongsheng Jiang
Xiaopeng Zhang
Wenrui Dai
H. Xiong
Qi Tian
ViT
26
70
0
31 Jul 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
57
71
0
30 Jul 2022
Contrastive Masked Autoencoders are Stronger Vision Learners
Zhicheng Huang
Xiaojie Jin
Cheng Lu
Qibin Hou
Mingg-Ming Cheng
Dongmei Fu
Xiaohui Shen
Jiashi Feng
50
148
0
27 Jul 2022
V
2
^2
2
L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval
Wenhao Wang
Yifan Sun
Zongxin Yang
Yi Yang
VLM
24
3
0
26 Jul 2022
MAR: Masked Autoencoders for Efficient Action Recognition
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Xiang Wang
Yuehuang Wang
Yiliang Lv
Changxin Gao
Nong Sang
32
42
0
24 Jul 2022
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
22
75
0
14 Jul 2022
iColoriT: Towards Propagating Local Hint to the Right Region in Interactive Colorization by Leveraging Vision Transformer
Jooyeol Yun
Sanghyeon Lee
Minho Park
Jaegul Choo
ViT
17
2
0
14 Jul 2022
Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain
Tong Zhang
Peng Gao
Hao-Chen Dong
Zhuang Yin
Guanqun Wang
Wei Zhang
He Chen
33
33
0
08 Jul 2022
Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds
Georg Hess
Johan Jaxing
Elias Svensson
David Hagerman
Christoffer Petersson
Lennart Svensson
3DPC
ViT
25
33
0
01 Jul 2022
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
Sanat Ramesh
V. Srivastav
Deepak Alapatt
Tong Yu
Aditya Murali
...
Saurav Sharma
A. Fleurentin
Georgios Exarchakis
Alexandros Karargyris
N. Padoy
23
42
0
01 Jul 2022
Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition
Mingkun Yang
Minghui Liao
Pu Lu
Jing Wang
Shenggao Zhu
Hualin Luo
Qingzhen Tian
X. Bai
SSL
33
55
0
01 Jul 2022
Teach me how to Interpolate a Myriad of Embeddings
Shashanka Venkataramanan
Ewa Kijak
Laurent Amsaleg
Yannis Avrithis
43
2
0
29 Jun 2022
Masked World Models for Visual Control
Younggyo Seo
Danijar Hafner
Hao Liu
Fangchen Liu
Stephen James
Kimin Lee
Pieter Abbeel
OffRL
93
146
0
28 Jun 2022
DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Change Detection
W. G. C. Bandara
Nithin Gopalakrishnan Nair
Vishal M. Patel
DiffM
29
5
0
23 Jun 2022
SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders
Gang Li
Heliang Zheng
Daqing Liu
Chaoyue Wang
Bing-Huang Su
Changwen Zheng
32
124
0
21 Jun 2022
Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
Chen Min
Xinli Xu
Dawei Zhao
Liang Xiao
Yiming Nie
Bin Dai
3DPC
38
50
0
20 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
37
97
0
16 Jun 2022
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
Viraj Prabhu
Sriram Yenamandra
Aaditya K. Singh
Judy Hoffman
36
14
0
16 Jun 2022
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Jiahao Xie
Wei Li
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
27
69
0
15 Jun 2022
SERE: Exploring Feature Self-relation for Self-supervised Transformer
Zhong-Yu Li
Shanghua Gao
Ming-Ming Cheng
ViT
MDE
26
14
0
10 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
31
52
0
09 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
N. Sebe
ViT
MDE
31
1
0
09 Jun 2022
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
Jia-Yu Pan
Pan Zhou
Shuicheng Yan
SSL
26
15
0
08 Jun 2022
Masked Unsupervised Self-training for Label-free Image Classification
Junnan Li
Silvio Savarese
Steven C. H. Hoi
VLM
SSL
18
12
0
07 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
23
21
0
02 Jun 2022
CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
Junlin Han
L. Petersson
Hongdong Li
Ian Reid
33
9
0
31 May 2022
GMML is All you Need
Sara Atito
Muhammad Awais
J. Kittler
ViT
VLM
46
18
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
64
26
0
30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
51
22
0
28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian Sun
Weiming Hu
ViT
67
41
0
28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
184
244
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
22
15
0
28 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei
Han Hu
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Jianmin Bao
Dong Chen
B. Guo
CLIP
88
124
0
27 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan.Z.Li
34
48
0
27 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
35
68
0
26 May 2022
HIRL: A General Framework for Hierarchical Image Representation Learning
Minghao Xu
Yuanfan Guo
Xuanyu Zhu
Jiawen Li
Zhenbang Sun
Jiangtao Tang
Yi Xu
Bingbing Ni
SSL
19
3
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
33
53
0
26 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
27
1
0
21 May 2022
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Jue Jiang
N. Tyagi
K. Tringale
C. Crane
Harini Veeraraghavan
MedIm
36
34
0
20 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
113
73
0
20 May 2022
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Jintang Li
Ruofan Wu
Wangbin Sun
Liang Chen
Sheng Tian
Liang Zhu
Changhua Meng
Zibin Zheng
Weiqiang Wang
SSL
24
79
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
30
51
0
19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
174
11
0
19 May 2022
Previous
1
2
3
...
15
16
17
Next