2111.09886
Cited By
SimMIM: A Simple Framework for Masked Image Modeling
18 November 2021
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
Papers citing
"SimMIM: A Simple Framework for Masked Image Modeling"
50 / 849 papers shown
Title
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
Amin Karimi Monsefi
Mengxi Zhou
Nastaran Karimi Monsefi
Ser-Nam Lim
Wei-Lun Chao
R. Ramnath
46
1
0
16 Sep 2024
Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing
Minh-Duc Vu
Zuheng Ming
Fangchen Feng
Bissmella Bahaduri
A. Mokraoui
ObjD
27
0
0
13 Sep 2024
Autoregressive Sequence Modeling for 3D Medical Image Representation
Siwen Wang
Churan Wang
Fei Gao
Lixian Su
Fandong Zhang
Yizhou Wang
Yizhou Yu
MedIm
23
1
0
13 Sep 2024
Hybrid-TTA: Continual Test-time Adaptation via Dynamic Domain Shift Detection
Hyewon Park
Hyejin Park
Jueun Ko
Dongbo Min
TTA
33
0
0
13 Sep 2024
Cross-conditioned Diffusion Model for Medical Image to Image Translation
Zhaohu Xing
Sicheng Yang
Sixiang Chen
Tian-Chun Ye
Yijun Yang
Jing Qin
Lei Zhu
DiffM
MedIm
42
6
0
13 Sep 2024
SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality
Chenyang Lei
Liyi Chen
Jun Cen
Xiao Chen
Zhen Lei
Felix Heide
Ziwei Liu
Qifeng Chen
Zhaoxiang Zhang
47
0
0
12 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
43
1
0
10 Sep 2024
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks
Amin Karimi Monsefi
Kishore Prakash Sailaja
Ali Alilooee
Ser-Nam Lim
R. Ramnath
VLM
37
6
0
10 Sep 2024
iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation
Hayeon Jo
Hyesong Choi
Minhee Cho
Dongbo Min
38
1
0
04 Sep 2024
Collaborative Learning for Enhanced Unsupervised Domain Adaptation
Minhee Cho
Hyesong Choi
Hayeon Jo
Dongbo Min
27
1
0
04 Sep 2024
Dual Advancement of Representation Learning and Clustering for Sparse and Noisy Images
Wenlin Li
Yucheng Xu
Xiaoqing Zheng
Suoya Han
Jun Wang
Xiaobo Sun
36
0
0
03 Sep 2024
MaskMol: Knowledge-guided Molecular Image Pre-Training Framework for Activity Cliffs
Zhixiang Cheng
Hongxin Xiang
Pengsen Ma
Li Zeng
Xin Jin
...
Yang Deng
Bosheng Song
Xinxin Feng
Changhui Deng
Xiangxiang Zeng
28
0
0
02 Sep 2024
Self-Supervised Vision Transformers for Writer Retrieval
Tim Raven
Arthur Matei
Gernot A. Fink
ViT
25
0
0
01 Sep 2024
A Survey of the Self Supervised Learning Mechanisms for Vision Transformers
Asifullah Khan
A. Sohail
M. Fiaz
Mehdi Hassan
Tariq Habib Afridi
...
Muhammad Zaigham Zaheer
Kamran Ali
Tangina Sultana
Ziaurrehman Tanoli
Naeem Akhter
45
3
0
30 Aug 2024
MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation
Linyan Yang
Lukas Hoyer
Mark Weber
Tobias Fischer
Dengxin Dai
Laura Leal-Taixé
Marc Pollefeys
Daniel Cremers
Luc Van Gool
MDE
40
3
0
29 Aug 2024
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Sarthak Yadav
Sergios Theodoridis
Zheng-Hua Tan
45
2
0
29 Aug 2024
Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
43
0
0
27 Aug 2024
Multi-Modal Instruction-Tuning Small-Scale Language-and-Vision Assistant for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Geethan Sannidhi
Venkataramana Runkana
38
1
0
27 Aug 2024
Hierarchical Network Fusion for Multi-Modal Electron Micrograph Representation Learning with Foundational Large Language Models
Sakhinana Sagar Srinivas
Geethan Sannidhi
Venkataramana Runkana
35
0
0
24 Aug 2024
Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models
Sakhinana Sagar Srinivas
Geethan Sannidhi
Sreeja Gangasani
Chidaksh Ravuru
Venkataramana Runkana
32
0
0
24 Aug 2024
Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
41
0
0
23 Aug 2024
Symmetric masking strategy enhances the performance of Masked Image Modeling
Khanh-Binh Nguyen
Chae Jung Park
34
0
0
23 Aug 2024
EMCNet: Graph-Nets for Electron Micrographs Classification
Sakhinana Sagar Srinivas
Rajat Kumar Sarkar
Venkataramana Runkana
40
0
0
21 Aug 2024
Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?
Chen Liang
Qiang Guo
Xiaochao Qu
Luoqi Liu
Ting Liu
VOS
34
0
0
20 Aug 2024
SpectralEarth: Training Hyperspectral Foundation Models at Scale
Nassim Ait Ali Braham
C. Albrecht
Julien Mairal
J. Chanussot
Yi Wang
X. Zhu
38
12
0
15 Aug 2024
Membership Inference Attack Against Masked Image Modeling
Zehan Li
Xinlei He
Ning Yu
Yang Zhang
42
1
0
13 Aug 2024
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
72
6
0
13 Aug 2024
Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes
Ke Zhou
Zhongwei Qiu
Dongmei Fu
VLM
37
1
0
12 Aug 2024
Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning
Xinrong Hu
Dewen Zeng
Yawen Wu
Xueyang Li
Yiyu Shi
ViT
MedIm
39
0
0
12 Aug 2024
HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training
Fenghe Tang
Ronghao Xu
Qingsong Yao
Xueming Fu
Quan Quan
Heqin Zhu
Zaiyi Liu
S. Kevin Zhou
SSL
MedIm
40
3
0
11 Aug 2024
PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification
Bin Hu
Xinggang Wang
Wenyu Liu
ViT
39
3
0
10 Aug 2024
AggSS: An Aggregated Self-Supervised Approach for Class-Incremental Learning
Jayateja Kalla
Soma Biswas
SSL
31
0
0
08 Aug 2024
LEGO: Self-Supervised Representation Learning for Scene Text Images
Yujin Ren
Jiaxin Zhang
Lianwen Jin
SSL
31
0
0
04 Aug 2024
Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI
Robert Wolfe
Aayushi Dangol
Alexis Hiniker
Bill Howe
34
2
0
04 Aug 2024
Masked Angle-Aware Autoencoder for Remote Sensing Images
Zhihao Li
B. Hou
Siteng Ma
Zitong Wu
Xianpeng Guo
Bo Ren
Licheng Jiao
46
11
0
04 Aug 2024
Contribution-based Low-Rank Adaptation with Pre-training Model for Real Image Restoration
Donwon Park
Leixian Shen
Se Young Chun
24
2
0
02 Aug 2024
POA: Pre-training Once for Models of All Sizes
Yingying Zhang
Xin Guo
Jiangwei Lao
Lei Yu
Lixiang Ru
Jian Wang
Guo Ye
Huimei He
Jingdong Chen
Ming Yang
65
1
0
02 Aug 2024
Advancing Medical Image Segmentation: Morphology-Driven Learning with Diffusion Transformer
Sungmin Kang
Jaeha Song
Jihie Kim
MedIm
36
2
0
01 Aug 2024
Noise-Resilient Unsupervised Graph Representation Learning via Multi-Hop Feature Quality Estimation
Shiyuan Li
Yixin Liu
Qingfeng Chen
Geoffrey I. Webb
Shirui Pan
SSL
37
4
0
29 Jul 2024
Self-Supervised Learning for Text Recognition: A Critical Survey
Carlos Peñarrubia
J. J. Valero-Mas
Jorge Calvo-Zaragoza
69
1
0
29 Jul 2024
MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training
Biao Wu
Yutong Xie
Zeyu Zhang
Minh Hieu Phan
Qi Chen
Ling-Hao Chen
Qi Wu
LM&MA
37
0
0
28 Jul 2024
QPT V2: Masked Image Modeling Advances Visual Scoring
Qizhi Xie
Kun Yuan
Yunpeng Qu
Mingda Wu
Ming-hui Sun
Chao Zhou
Jihong Zhu
39
3
0
23 Jul 2024
A Multi-view Mask Contrastive Learning Graph Convolutional Neural Network for Age Estimation
Yiping Zhang
Yuntao Shou
Tao Meng
Wei Ai
Keqin Li
CVBM
48
10
0
23 Jul 2024
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei
Abhinav Gupta
Pedro Morgado
SSL
47
7
0
22 Jul 2024
SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi
Michael Dorkenwald
Fida Mohammad Thoker
E. Gavves
Cees G. M. Snoek
Yuki M. Asano
55
3
0
22 Jul 2024
Improving Representation of High-frequency Components for Medical Visual Foundation Models
Yuetan Chu
Yilan Zhang
Zhongyi Han
Changchun Yang
Longxi Zhou
Gongning Luo
Chao Huang
Xin Gao
MedIm
47
1
0
19 Jul 2024
Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition
Gagan Bhatia
El Moatez Billah Nagoudi
Fakhraddin Alwajih
Muhammad Abdul-Mageed
34
3
0
18 Jul 2024
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa
Shuming Liu
Guohao Li
26
2
0
17 Jul 2024
Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
Hyun Seok Seong
WonJun Moon
Subeen Lee
Jae-Pil Heo
40
0
0
17 Jul 2024
A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification
Markus Marks
Manuel Knott
Neehar Kondapaneni
Elijah Cole
T. Defraeye
Fernando Pérez-Cruz
Pietro Perona
SSL
45
2
0
16 Jul 2024