2111.06377
Cited By
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,777 papers shown
Title
ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning
Yi Yang
Lei Zhong
Huiping Zhuang
3DPC
CLL
95
0
0
18 Sep 2024
Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification
Fatema Jannat
Sina Gholami
Jennifer I. Lim
Theodore Leng
Minhaj Nur Alam
Hamed Tabkhi
48
1
0
17 Sep 2024
Identifying Influential nodes in Brain Networks via Self-Supervised Graph-Transformer
Yanqing Kang
Di Zhu
Haiyang Zhang
Enze Shi
Sigang Yu
...
Xuan Liu
Geng Chen
Xi Jiang
Tuo Zhang
Shu Zhang
51
0
0
17 Sep 2024
Sparks of Artificial General Intelligence(AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier of Generative AI-Assisted Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Geethan Sannidhi
Sreeja Gangasani
Chidaksh Ravuru
Venkataramana Runkana
94
0
0
17 Sep 2024
MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
Lehong Wu
Lilang Lin
Jiahang Zhang
Yi Ma
Jiaying Liu
DiffM
105
2
0
16 Sep 2024
Rapid Adaptation of Earth Observation Foundation Models for Segmentation
Karthick Panner Selvam
Raúl Ramos-Pollán
F. Kalaitzis
AI4CE
93
5
0
16 Sep 2024
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
Amin Karimi Monsefi
Mengxi Zhou
Nastaran Karimi Monsefi
Ser-Nam Lim
Wei-Lun Chao
R. Ramnath
132
1
0
16 Sep 2024
Investigation of Hierarchical Spectral Vision Transformer Architecture for Classification of Hyperspectral Imagery
Wei Liu
Saurabh Prasad
Melba M. Crawford
70
4
0
14 Sep 2024
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval
Amirreza Mahbod
Nematollah Saeidi
Sepideh Hatamikia
Ramona Woitek
VLM
MedIm
126
4
0
14 Sep 2024
Phikon-v2, A large and public feature extractor for biomarker prediction
Alexandre Filiot
Paul Jacob
Alice Mac Kain
Charlie Saillard
MedIm
87
21
0
13 Sep 2024
Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing
Minh-Duc Vu
Zuheng Ming
Fangchen Feng
Bissmella Bahaduri
A. Mokraoui
ObjD
49
0
0
13 Sep 2024
Uncertainty and Generalizability in Foundation Models for Earth Observation
Raúl Ramos-Pollán
F. Kalaitzis
Karthick Panner Selvam
50
0
0
13 Sep 2024
HTR-VT: Handwritten Text Recognition with Vision Transformer
Yuting Li
Dexiong Chen
Tinglong Tang
Xi Shen
ViT
63
13
0
13 Sep 2024
Hybrid-TTA: Continual Test-time Adaptation via Dynamic Domain Shift Detection
Hyewon Park
Hyejin Park
Jueun Ko
Dongbo Min
TTA
80
0
0
13 Sep 2024
Exploiting Supervised Poison Vulnerability to Strengthen Self-Supervised Defense
Jeremy A. Styborski
Mingzhi Lyu
Yunpeng Huang
Adams Kong
113
0
0
13 Sep 2024
VLTP: Vision-Language Guided Token Pruning for Task-Oriented Segmentation
Hanning Chen
Yang Ni
Wenjun Huang
Yezi Liu
SungHeon Jeong
Fei Wen
Nathaniel D. Bastian
Hugo Latapie
Mohsen Imani
VLM
85
4
0
13 Sep 2024
Autoregressive Sequence Modeling for 3D Medical Image Representation
Siwen Wang
Churan Wang
Fei Gao
Lixian Su
Fandong Zhang
Yizhou Wang
Yizhou Yu
MedIm
127
1
0
13 Sep 2024
Hand-Object Interaction Pretraining from Videos
Himanshu Gaurav Singh
Antonio Loquercio
Carmelo Sferrazza
Jane Wu
Haozhi Qi
Pieter Abbeel
Jitendra Malik
88
18
0
12 Sep 2024
SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality
Chenyang Lei
Liyi Chen
Jun Cen
Xiao Chen
Zhen Lei
Felix Heide
Ziwei Liu
Qifeng Chen
Zhaoxiang Zhang
97
0
0
12 Sep 2024
Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation?
Kerem Cekmeceli
Meva Himmetoglu
G. I. Tombak
A. Susmelj
Ertunc Erdil
E. Konukoglu
MedIm
59
3
0
12 Sep 2024
UNIT: Unsupervised Online Instance Segmentation through Time
Corentin Sautier
Gilles Puy
Alexandre Boulch
Renaud Marlet
Vincent Lepetit
94
1
0
12 Sep 2024
Early Joint Learning of Emotion Information Makes MultiModal Model Understand You Better
Mengying Ge
Mingyang Li
Dongkai Tang
Pengbo Li
Kuo Liu
Shuhao Deng
Songbai Pu
Liu Liu
Yang Song
Tao Zhang
80
0
0
12 Sep 2024
Learning Brain Tumor Representation in 3D High-Resolution MR Images via Interpretable State Space Models
Qingqiao Hu
Daoan Zhang
Jiebo Luo
Zhenyu Gong
Benedikt Wiestler
Jianguo Zhang
Hongwei Bran Li
55
0
0
12 Sep 2024
Token Turing Machines are Efficient Vision Models
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
James C. Davis
Yung-Hsiang Lu
181
0
0
11 Sep 2024
Automated Discovery of Pairwise Interactions from Unstructured Data
Zuheng Xu
Moksh Jain
Ali Denton
Shawn Whitfield
Aniket Didolkar
Berton Earnshaw
Jason S. Hartford
75
5
0
11 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
84
2
0
11 Sep 2024
PaveSAM Segment Anything for Pavement Distress
Neema Jakisa Owor
Y. Adu-Gyamfi
Armstrong Aboah
M. Amo-Boateng
VLM
77
6
0
11 Sep 2024
Swin-LiteMedSAM: A Lightweight Box-Based Segment Anything Model for Large-Scale Medical Image Datasets
Ruochen Gao
Donghang Lyu
Marius Staring
VLM
MedIm
53
4
0
11 Sep 2024
Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts
Assefa Seyoum Wahd
B. Felfeliyan
Yuyue Zhou
Shrimanti Ghosh
Adam McArthur
Jiechen Zhang
Jacob L. Jaremko
A. Hareendranathan
VLM
MedIm
90
1
0
10 Sep 2024
How Molecules Impact Cells: Unlocking Contrastive PhenoMolecular Retrieval
Philip Fradkin
Puria Azadi
Karush Suri
Frederik Wenkel
A. Bashashati
Maciej Sypetkowski
Dominique Beaini
87
5
0
10 Sep 2024
Hierarchical Multi-Label Classification with Missing Information for Benthic Habitat Imagery
Isaac Xu
B. Misiuk
Scott C. Lowe
Martin Gillis
Craig J. Brown
Thomas Trappenberg
SSL
63
3
0
10 Sep 2024
High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study
Shijie Chang
Lihe Zhang
Huchuan Lu
VLM
69
1
0
10 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
121
2
0
10 Sep 2024
ReAugment: Model Zoo-Guided RL for Few-Shot Time Series Augmentation and Forecasting
Haochen Yuan
Yutong Wang
Yihong Chen
Yunbo Wang
Xiaokang Yang
AI4TS
81
0
0
10 Sep 2024
AMNS: Attention-Weighted Selective Mask and Noise Label Suppression for Text-to-Image Person Retrieval
Runqing Zhang
Xue Zhou
171
1
0
10 Sep 2024
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks
Amin Karimi Monsefi
Kishore Prakash Sailaja
Ali Alilooee
Ser-Nam Lim
R. Ramnath
VLM
102
9
0
10 Sep 2024
Deep Learning for Video Anomaly Detection: A Review
Peng Wu
Chengyu Pan
Yuting Yan
Guansong Pang
Peng Wang
Yanning Zhang
VLM
AI4TS
80
11
0
09 Sep 2024
Driving with Prior Maps: Unified Vector Prior Encoding for Autonomous Vehicle Mapping
Shuang Zeng
Xinyuan Chang
Xinran Liu
Zheng Pan
Xing Wei
129
3
0
09 Sep 2024
A foundation model enpowered by a multi-modal prompt engine for universal seismic geobody interpretation across surveys
Hang Gao
Xinming Wu
Luming Liang
Hanlin Sheng
Xu Si
Gao Hui
Yaxing Li
AI4CE
74
2
0
08 Sep 2024
Explicit Mutual Information Maximization for Self-Supervised Learning
Lele Chang
Peilin Liu
Qinghai Guo
Fei Wen
SSL
87
0
0
07 Sep 2024
Improving agent performance in fluid environments by perceptual pretraining
Jin Zhang
Jianyang Xue
Bochao Cao
AI4CE
61
0
0
05 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
142
23
0
05 Sep 2024
Collaborative Learning for Enhanced Unsupervised Domain Adaptation
Minhee Cho
Hyesong Choi
Hayeon Jo
Dongbo Min
167
1
0
04 Sep 2024
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Haiyu Wu
Jaskirat Singh
Sicong Tian
Liang Zheng
Kevin W. Bowyer
CVBM
145
4
0
04 Sep 2024
iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation
Hayeon Jo
Hyesong Choi
Minhee Cho
Dongbo Min
124
2
0
04 Sep 2024
Human-AI Collaborative Multi-modal Multi-rater Learning for Endometriosis Diagnosis
Hu Wang
David Butler
Yuan Zhang
Jodie C Avery
Steven Knox
Congbo Ma
Louise Hull
Gustavo Carneiro
65
2
0
03 Sep 2024
Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment
Konstantin Schall
Kai Uwe Barthel
Nico Hezel
Klaus Jung
VLM
92
3
0
03 Sep 2024
AstroMAE: Redshift Prediction Using a Masked Autoencoder with a Novel Fine-Tuning Architecture
Amirreza Dolatpour Fathkouhi
Geoffrey Charles Fox
30
1
0
03 Sep 2024
UWStereo: A Large Synthetic Dataset for Underwater Stereo Matching
Qingxuan Lv
Junyu Dong
Yuezun Li
Sheng Chen
Hui Yu
Shu Zhang
Wenhan Wang
3DV
69
0
0
03 Sep 2024
Dreaming is All You Need
Mingze Ni
Wei Liu
53
0
0
03 Sep 2024