Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.05656
Cited By
MST: Masked Self-Supervised Transformer for Visual Representation
10 June 2021
Zhaowen Li
Zhiyang Chen
Fan Yang
Wei Li
Yousong Zhu
Chaoyang Zhao
Rui Deng
Liwei Wu
Rui Zhao
Ming Tang
Jinqiao Wang
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MST: Masked Self-Supervised Transformer for Visual Representation"
44 / 44 papers shown
Title
Evolved Hierarchical Masking for Self-Supervised Learning
Zhanzhou Feng
Shiliang Zhang
49
0
0
12 Apr 2025
MIMRS: A Survey on Masked Image Modeling in Remote Sensing
Shabnam Choudhury
Akhil Vasim
Michael Schmitt
Biplab Banerjee
38
0
0
04 Apr 2025
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
122
4
0
11 Feb 2025
A Survey of the Self Supervised Learning Mechanisms for Vision Transformers
Asifullah Khan
A. Sohail
M. Fiaz
Mehdi Hassan
Tariq Habib Afridi
...
Muhammad Zaigham Zaheer
Kamran Ali
Tangina Sultana
Ziaurrehman Tanoli
Naeem Akhter
45
3
0
30 Aug 2024
Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?
Chen Liang
Qiang Guo
Xiaochao Qu
Luoqi Liu
Ting Liu
VOS
34
0
0
20 Aug 2024
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
72
6
0
13 Aug 2024
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis
Jiaxin Zhuang
Linshan Wu
Qiong Wang
V. Vardhanabhuti
Lin Luo
Hao Chen
Hao Chen
57
4
0
24 Apr 2024
Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin
Gedas Bertasius
37
5
0
28 Mar 2024
Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization
Han Guo
Ramtin Hosseini
Ruiyi Zhang
Sai Ashish Somayajula
Ranak Roy Chowdhury
Rajesh K. Gupta
Pengtao Xie
36
0
0
28 Feb 2024
Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
Amin Karimi Monsefi
Payam Karisani
Mengxi Zhou
Stacey S. Choi
Nathan Doble
Heng Ji
Srinivasan Parthasarathy
R. Ramnath
43
5
0
09 Feb 2024
LMD: Faster Image Reconstruction with Latent Masking Diffusion
Zhiyuan Ma
Zhihuan Yu
Jianjun Li
Bowen Zhou
DiffM
24
8
0
13 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
37
4
0
05 Dec 2023
Enhancing Representations through Heterogeneous Self-Supervised Learning
Zhongyu Li
Bo-Wen Yin
Yongxiang Liu
Li Liu
Ming-Ming Cheng
SSL
28
2
0
08 Oct 2023
Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu
Yichen Zhu
ViT
72
17
0
05 Jul 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
32
4
0
23 May 2023
FreConv: Frequency Branch-and-Integration Convolutional Networks
Zhaowen Li
Xu Zhao
Peigeng Ding
Zongxin Gao
Yuting Yang
Ming Tang
Jinqiao Wang
26
2
0
10 Apr 2023
Remote Sensing Scene Classification with Masked Image Modeling (MIM)
Liya Wang
A. Tien
35
3
0
28 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
35
50
0
13 Feb 2023
Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
Liya Wang
A. Tien
44
7
0
28 Jan 2023
Understanding Self-Supervised Pretraining with Part-Aware Representation Learning
Jie Zhu
Jiyang Qi
Mingyu Ding
Xiaokang Chen
Ping Luo
Xinggang Wang
Wenyu Liu
Leye Wang
Jingdong Wang
SSL
33
8
0
27 Jan 2023
MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy
Gihun Lee
Sangmook Kim
Joonkee Kim
Se-Young Yun
MedIm
19
18
0
07 Dec 2022
CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow
Philippe Weinzaepfel
Thomas Lucas
Vincent Leroy
Yohann Cabon
Vaibhav Arora
Romain Brégier
G. Csurka
L. Antsfeld
Boris Chidlovskii
Jérôme Revaud
ViT
23
82
0
18 Nov 2022
Masked Contrastive Representation Learning
Yuan Yao
Nandakishor Desai
M. Palaniswami
SSL
22
8
0
11 Nov 2022
Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers
Zhiwei Lin
Ze Yang
Yongtao Wang
ViT
36
2
0
24 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
34
32
0
21 Oct 2022
SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
Yijin Huang
Junyan Lyu
Pujin Cheng
Roger Tam
Xiaoying Tang
ViT
MedIm
19
20
0
20 Oct 2022
CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion
Philippe Weinzaepfel
Vincent Leroy
Thomas Lucas
Romain Brégier
Yohann Cabon
Vaibhav Arora
L. Antsfeld
Boris Chidlovskii
G. Csurka
Jérôme Revaud
SSL
42
64
0
19 Oct 2022
Improving Dense Contrastive Learning with Dense Negative Pairs
Berk Iskender
Zhenlin Xu
Simon Kornblith
Enhung Chu
M. Khademi
SSL
31
1
0
11 Oct 2022
Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders
Haosen Yang
Deng Huang
Bin Wen
Jiannan Wu
H. Yao
Yi-Xin Jiang
Xiatian Zhu
Zehuan Yuan
37
19
0
09 Oct 2022
TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut
Yangtao Wang
Xiaoke Shen
Yuan. Yuan
Yuming Du
Maomao Li
S. Hu
James L. Crowley
Dominique Vaufreydaz
VOS
ViT
27
76
0
01 Sep 2022
Transfering Low-Frequency Features for Domain Adaptation
Zhaowen Li
Xu Zhao
Chaoyang Zhao
Ming Tang
Jinqiao Wang
41
7
0
31 Aug 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
57
71
0
30 Jul 2022
Teach me how to Interpolate a Myriad of Embeddings
Shashanka Venkataramanan
Ewa Kijak
Laurent Amsaleg
Yannis Avrithis
43
2
0
29 Jun 2022
Masked World Models for Visual Control
Younggyo Seo
Danijar Hafner
Hao Liu
Fangchen Liu
Stephen James
Kimin Lee
Pieter Abbeel
OffRL
87
146
0
28 Jun 2022
Rethinking Generalization in Few-Shot Classification
Markus Hiller
Rongkai Ma
Mehrtash Harandi
Tom Drummond
OCL
VLM
30
55
0
15 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
32
22
0
09 Jun 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
115
17
0
30 May 2022
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Jue Jiang
N. Tyagi
K. Tringale
C. Crane
Harini Veeraraghavan
MedIm
36
34
0
20 May 2022
Representation Learning by Detecting Incorrect Location Embeddings
Sepehr Sameni
Simon Jenni
Paolo Favaro
ViT
34
4
0
10 Apr 2022
iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
21
710
0
15 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
77
330
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
326
5,785
0
29 Apr 2021
Instance Localization for Self-supervised Detection Pretraining
Ceyuan Yang
Zhirong Wu
Bolei Zhou
Stephen Lin
ViT
SSL
100
145
0
16 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
267
3,375
0
09 Mar 2020
1