2111.09886
Cited By
SimMIM: A Simple Framework for Masked Image Modeling
18 November 2021
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
Papers citing
"SimMIM: A Simple Framework for Masked Image Modeling"
50 / 849 papers shown
Distilling Representations from GAN Generator via Squeeze and Span
Yu Yang
Xiaotian Cheng
Chang-rui Liu
Hakan Bilen
Xiang Ji
GAN
31
0
0
06 Nov 2022
Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis
Changyuan Qiu
Winston Wu
Xinliang Frederick Zhang
Lu Wang
22
1
0
04 Nov 2022
Could Giant Pretrained Image Models Extract Universal Representations?
Yutong Lin
Ze Liu
Zheng Zhang
Han Hu
Nanning Zheng
Stephen Lin
Yue Cao
VLM
51
9
0
03 Nov 2022
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
21
1
0
03 Nov 2022
RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representation from X-Ray Images
Guang Li
Ren Togo
Takahiro Ogawa
Miki Haseyama
24
0
0
01 Nov 2022
Self-supervised Character-to-Character Distillation for Text Recognition
Tongkun Guan
Wei Shen
Xuehang Yang
Qi Feng
Zekun Jiang
Xiaokang Yang
45
26
0
01 Nov 2022
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
42
4
0
01 Nov 2022
A simple, efficient and scalable contrastive masked autoencoder for learning visual representations
Shlok Kumar Mishra
Joshua Robinson
Huiwen Chang
David Jacobs
Aaron Sarna
Aaron Maschinot
Dilip Krishnan
DiffM
43
30
0
30 Oct 2022
Learning Explicit Object-Centric Representations with Vision Transformers
Oscar Vikström
Alexander Ilin
OCL
ViT
38
4
0
25 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
37
59
0
23 Oct 2022
Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and Future
Guo-Jun Qi
M. Shah
SSL
23
8
0
23 Oct 2022
i-MAE: Are Latent Representations in Masked Autoencoders Linearly Separable?
Kevin Zhang
Zhiqiang Shen
20
8
0
20 Oct 2022
MixMask: Revisiting Masking Strategy for Siamese ConvNets
Kirill Vishniakov
Eric P. Xing
Zhiqiang Shen
18
0
0
20 Oct 2022
Self-Supervised Learning with Masked Image Modeling for Teeth Numbering, Detection of Dental Restorations, and Instance Segmentation in Dental Panoramic Radiographs
A. Almalki
Longin Jan Latecki
MedIm
22
14
0
20 Oct 2022
Towards Sustainable Self-supervised Learning
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
CLL
48
7
0
20 Oct 2022
SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
Yijin Huang
Junyan Lyu
Pujin Cheng
Roger Tam
Xiaoying Tang
ViT
MedIm
19
20
0
20 Oct 2022
CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion
Philippe Weinzaepfel
Vincent Leroy
Thomas Lucas
Romain Brégier
Yohann Cabon
Vaibhav Arora
L. Antsfeld
Boris Chidlovskii
G. Csurka
Jérôme Revaud
SSL
42
64
0
19 Oct 2022
Intra-Source Style Augmentation for Improved Domain Generalization
Yumeng Li
Dan Zhang
M. Keuper
Anna Khoreva
29
32
0
18 Oct 2022
Token Merging: Your ViT But Faster
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
51
422
0
17 Oct 2022
How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
Qi Zhang
Yifei Wang
Yisen Wang
28
73
0
15 Oct 2022
How to Train Vision Transformer on Small-scale Datasets?
Hanan Gani
Muzammal Naseer
Mohammad Yaqub
ViT
20
51
0
13 Oct 2022
Exploring Long-Sequence Masked Autoencoders
Ronghang Hu
Shoubhik Debnath
Saining Xie
Xinlei Chen
8
18
0
13 Oct 2022
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
Xiaoyang Wu
Yixing Lao
Li Jiang
Xihui Liu
Hengshuang Zhao
3DPC
ViT
23
367
0
11 Oct 2022
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
Cheng-Hao Wang
Wenzhao Zheng
Zhengbiao Zhu
Jie Zhou
Jiwen Lu
SSL
AI4TS
50
4
0
11 Oct 2022
Turbo Training with Token Dropout
Tengda Han
Weidi Xie
Andrew Zisserman
ViT
34
10
0
10 Oct 2022
Denoising Masked AutoEncoders Help Robust Classification
Quanlin Wu
Hang Ye
Yuntian Gu
Huishuai Zhang
Liwei Wang
Di He
14
21
0
10 Oct 2022
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning
Zijia Zhao
Longteng Guo
Xingjian He
Shuai Shao
Zehuan Yuan
Jing Liu
21
8
0
09 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
29
30
0
08 Oct 2022
Effective Self-supervised Pre-training on Low-compute Networks without Distillation
Fuwen Tan
F. Saleh
Brais Martínez
35
4
0
06 Oct 2022
Image Masking for Robust Self-Supervised Monocular Depth Estimation
Hemang Chawla
Kishaan Jeeveswaran
Elahe Arani
Bahram Zonooz
MDE
43
7
0
05 Oct 2022
Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders
Youngwan Lee
Jeffrey Willette
Jonghee Kim
Juho Lee
Sung Ju Hwang
34
16
0
05 Oct 2022
Backdoor Attacks in the Supply Chain of Masked Image Modeling
Xinyue Shen
Xinlei He
Zheng Li
Yun Shen
Michael Backes
Yang Zhang
46
8
0
04 Oct 2022
MTSMAE: Masked Autoencoders for Multivariate Time-Series Forecasting
Peiwang Tang
Xianchao Zhang
AI4TS
35
12
0
04 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng Zhang
Chao Zhang
Han Hu
35
25
0
03 Oct 2022
Federated Training of Dual Encoding Models on Small Non-IID Client Datasets
Raviteja Vemulapalli
Warren Morningstar
Philip Mansfield
Hubert Eichner
K. Singhal
Arash Afkanpour
Bradley Green
FedML
39
2
0
30 Sep 2022
Rethinking the Learning Paradigm for Facial Expression Recognition
Weijie Wang
N. Sebe
Bruno Lepri
36
2
0
30 Sep 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
33
68
0
29 Sep 2022
PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification
Wenhao Tang
Shengyue Huang
Xiaoxian Zhang
Luwen Huangfu
ViT
37
2
0
21 Sep 2022
S^3R: Self-supervised Spectral Regression for Hyperspectral Histopathology Image Classification
Xingran Xie
Yan Wang
Qingli Li
56
4
0
19 Sep 2022
MetaMask: Revisiting Dimensional Confounder for Self-Supervised Learning
Jiangmeng Li
Jingyao Wang
Yanan Zhang
Wenyi Mo
Changwen Zheng
Bing-Huang Su
Hui Xiong
SSL
34
14
0
16 Sep 2022
Test-Time Training with Masked Autoencoders
Yossi Gandelsman
Yu Sun
Xinlei Chen
Alexei A. Efros
OOD
45
165
0
15 Sep 2022
Exploring Target Representations for Masked Autoencoders
Xingbin Liu
Jinghao Zhou
Tao Kong
Xianming Lin
Rongrong Ji
100
50
0
08 Sep 2022
MimCo: Masked Image Modeling Pre-training with Contrastive Teacher
Qiang-feng Zhou
Chaohui Yu
Haowen Luo
Zhibin Wang
Hao Li
VLM
56
20
0
07 Sep 2022
Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision
Lei Zhang
H. Shum
VLM
SSL
22
2
0
06 Sep 2022
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Tsu-jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
William Yang Wang
Lijuan Wang
Zicheng Liu
VLM
26
64
0
04 Sep 2022
Visual Prompting via Image Inpainting
Amir Bar
Yossi Gandelsman
Trevor Darrell
Amir Globerson
Alexei A. Efros
VLM
VPVLM
24
200
0
01 Sep 2022
Masked Autoencoders Enable Efficient Knowledge Distillers
Yutong Bai
Zeyu Wang
Junfei Xiao
Chen Wei
Huiyu Wang
Alan Yuille
Yuyin Zhou
Cihang Xie
CLL
32
39
0
25 Aug 2022
Accelerating Vision Transformer Training via a Patch Sampling Schedule
Bradley McDanel
C. Huynh
ViT
30
1
0
19 Aug 2022
VLMAE: Vision-Language Masked Autoencoder
Su He
Taian Guo
Tao Dai
Ruizhi Qiao
Chen Wu
Xiujun Shu
Bohan Ren
VLM
34
11
0
19 Aug 2022