2111.09886
SimMIM: A Simple Framework for Masked Image Modeling
18 November 2021
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
Papers citing "SimMIM: A Simple Framework for Masked Image Modeling"
50 / 849 papers shown
Title
On Pretraining Data Diversity for Self-Supervised Learning
Hasan Hammoud
Tuhin Das
Fabio Pizzati
Philip Torr
Adel Bibi
Guohao Li
103
2
0
20 Mar 2024
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining
Di Wang
Jing Zhang
Minqiang Xu
Lin Liu
Dongsheng Wang
...
Chengxi Han
Haonan Guo
Bo Du
Dacheng Tao
L. Zhang
39
44
0
20 Mar 2024
LocalMamba: Visual State Space Model with Windowed Selective Scan
Tao Huang
Xiaohuan Pei
Shan You
Fei-Yue Wang
Chao Qian
Chang Xu
Mamba
45
139
0
14 Mar 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
63
6
0
14 Mar 2024
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
49
2
0
13 Mar 2024
LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition
Zhonglin Sun
Chen Feng
Ioannis Patras
Georgios Tzimiropoulos
CVBM
SSL
35
3
0
13 Mar 2024
CAMSIC: Content-aware Masked Image Modeling Transformer for Stereo Image Compression
Xinjie Zhang
Shenyuan Gao
Zhening Liu
Jiawei Shao
Xingtong Ge
Dailan He
Tongda Xu
Yan Wang
Jun Zhang
48
1
0
13 Mar 2024
Masked AutoDecoder is Effective Multi-Task Vision Generalist
Han Qiu
Jiaxing Huang
Peng Gao
Lewei Lu
Xiaoqin Zhang
Shijian Lu
51
4
0
12 Mar 2024
Can Generative Models Improve Self-Supervised Representation Learning?
Sana Ayromlou
Arash Afkanpour
Vahid Reza Khazaie
Fereshteh Forghani
31
3
0
09 Mar 2024
Fine-tuning a Multiple Instance Learning Feature Extractor with Masked Context Modelling and Knowledge Distillation
Juan Pisula
Katarzyna Bozek
35
2
0
08 Mar 2024
Spatiotemporal Predictive Pre-training for Robotic Motor Control
Jiange Yang
Bei Liu
Jianlong Fu
Bocheng Pan
Gangshan Wu
Limin Wang
42
10
0
08 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
67
12
0
05 Mar 2024
Vision-Language Models for Medical Report Generation and Visual Question Answering: A Review
Iryna Hartsock
Ghulam Rasool
49
62
0
04 Mar 2024
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
Xiangxiang Chu
Jianlin Su
Bo-Wen Zhang
Chunhua Shen
MLLM
44
10
0
01 Mar 2024
Data-efficient Event Camera Pre-training via Disentangled Masked Modeling
Zhenpeng Huang
Chao Li
Hao Chen
Yongjian Deng
Yifeng Geng
Limin Wang
45
2
0
01 Mar 2024
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan
Pei Fu
Shan Guo
Qianyi Jiang
Xiaoming Wei
VLM
46
5
0
01 Mar 2024
MaskLRF: Self-supervised Pretraining via Masked Autoencoding of Local Reference Frames for Rotation-invariant 3D Point Set Analysis
Takahiko Furuya
3DPC
43
2
0
01 Mar 2024
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei
Tao Chen
XiRuo Jiang
Huafeng Liu
Zeren Sun
Yazhou Yao
VGen
42
9
0
29 Feb 2024
Self-Supervised Learning with Generative Adversarial Networks for Electron Microscopy
Bashir Kazimi
Karina Ruzaeva
Stefan Sandfeld
39
4
0
28 Feb 2024
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
Jiequan Cui
Beier Zhu
Xin Wen
Xiaojuan Qi
Bei Yu
Hanwang Zhang
25
7
0
28 Feb 2024
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation
Hanan Gani
Muzammal Naseer
Fahad Khan
Salman Khan
28
0
0
27 Feb 2024
LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning
Shentong Mo
Yansen Wang
Xufang Luo
Dongsheng Li
VLM
41
1
0
27 Feb 2024
Attention-Guided Masked Autoencoders For Learning Image Representations
Leon Sick
Dominik Engel
Pedro Hermosilla
Timo Ropinski
34
1
0
23 Feb 2024
The Common Stability Mechanism behind most Self-Supervised Learning Approaches
Abhishek Jha
Matthew B. Blaschko
Yuki M. Asano
Tinne Tuytelaars
SSL
35
1
0
22 Feb 2024
Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
Johnathan Xie
Yoonho Lee
Annie S. Chen
Chelsea Finn
25
3
0
22 Feb 2024
Overcoming Dimensional Collapse in Self-supervised Contrastive Learning for Medical Image Segmentation
Jamshid Hassanpour
V. Srivastav
Didier Mutter
N. Padoy
SSL
57
2
0
22 Feb 2024
A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation
Yuyue Zhou
B. Felfeliyan
Shrimanti Ghosh
Jessica Knight
Fatima Alves-Pereira
Christopher Keen
Jessica Küpper
A. Hareendranathan
Jacob L. Jaremko
37
0
0
22 Feb 2024
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Chien-Yao Wang
I-Hau Yeh
Hong-Yuan Mark Liao
57
1,151
0
21 Feb 2024
Revisiting Feature Prediction for Learning Visual Representations from Video
Adrien Bardes
Q. Garrido
Jean Ponce
Xinlei Chen
Michael G. Rabbat
Yann LeCun
Mahmoud Assran
Nicolas Ballas
MDE
VLM
92
73
0
15 Feb 2024
Learning Low-Rank Feature for Thorax Disease Classification
Rajeev Goel
Utkarsh Nath
Yancheng Wang
Alvin C. Silva
Teresa Wu
Yingzhen Yang
22
0
0
14 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
Amin Karimi Monsefi
Payam Karisani
Mengxi Zhou
Stacey S. Choi
Nathan Doble
Heng Ji
Srinivasan Parthasarathy
R. Ramnath
43
5
0
09 Feb 2024
Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts
Zhili Liu
Kai Chen
Jianhua Han
Lanqing Hong
Hang Xu
Zhenguo Li
James T. Kwok
MoE
111
24
0
08 Feb 2024
MOMENT: A Family of Open Time-series Foundation Models
Mononito Goswami
Konrad Szafer
Arjun Choudhry
Yifu Cai
Shuo Li
Artur Dubrawski
AIFin
AI4TS
71
111
0
06 Feb 2024
Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing
Yan Shu
Weichao Zeng
Zhenhang Li
Fangmin Zhao
Yu Zhou
32
3
0
05 Feb 2024
Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives
Sheng Luo
Wei Chen
Wanxin Tian
Rui Liu
Luanxuan Hou
...
Ling Shao
Yi Yang
Bojun Gao
Qun Li
Guobin Wu
51
13
0
05 Feb 2024
TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling
Jiaxiang Dong
Haixu Wu
Yuxuan Wang
Yunzhong Qiu
Li Zhang
Jianmin Wang
Mingsheng Long
AI4TS
18
13
0
04 Feb 2024
Exploring Intrinsic Properties of Medical Images for Self-Supervised Binary Semantic Segmentation
P. Singh
Jacopo Cirrone
14
0
0
04 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Yue Fan
Qing Li
Yuntao Du
VLM
75
75
0
03 Feb 2024
A Probabilistic Model behind Self-Supervised Learning
Alice Bizeul
Bernhard Schölkopf
Carl Allen
SSL
26
2
0
02 Feb 2024
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram
Yeongyeon Na
Minje Park
Yunwon Tae
S. Joo
32
24
0
02 Feb 2024
SimAda: A Simple Unified Framework for Adapting Segment Anything Model in Underperformed Scenes
Yiran Song
Qianyu Zhou
Xuequan Lu
Zhiwen Shao
Lizhuang Ma
48
7
0
31 Jan 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
52
8
0
29 Jan 2024
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
Shaofeng Zhang
Jinfa Huang
Qiang-feng Zhou
Zhibin Wang
Fan Wang
Jiebo Luo
Junchi Yan
DiffM
34
11
0
28 Jan 2024
Masked Pre-trained Model Enables Universal Zero-shot Denoiser
Xiaoxiao Ma
Zhixiang Wei
Yi Jin
Pengyang Ling
Tianle Liu
Ben Wang
Junkang Dai
H. Chen
Enhong Chen
VLM
43
0
0
26 Jan 2024
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Xinlei Chen
Zhuang Liu
Saining Xie
Kaiming He
DiffM
35
54
0
25 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
34
14
0
25 Jan 2024
OCT-SelfNet: A Self-Supervised Framework with Multi-Modal Datasets for Generalized and Robust Retinal Disease Detection
Fatema Jannat
Sina Gholami
Minha Alam
Hamed Tabkhi
16
1
0
22 Jan 2024
Spatial Structure Constraints for Weakly Supervised Semantic Segmentation
Tao Chen
Yazhou Yao
Xing-Rui Huang
Zechao Li
Liqiang Nie
Jinhui Tang
29
15
0
20 Jan 2024
LDReg: Local Dimensionality Regularized Self-Supervised Learning
Hanxun Huang
R. Campello
S. Erfani
Xingjun Ma
Michael E. Houle
James Bailey
38
5
0
19 Jan 2024