2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners" (50 of 4,777 papers shown)
Title
Detect and Correct: A Selective Noise Correction Method for Learning with Noisy Labels
Yuval Grinberg
Nimrod Harel
Jacob Goldberger
Ofir Lindenbaum
NoLa
76
0
0
19 May 2025
MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion
Wei Hua
Chenlin Zhou
Jibin Wu
Yansong Chua
Yangyang Shu
110
0
0
19 May 2025
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie
Jiahao Nie
Yujin Tang
W. Zhang
Hongshen Zhao
Mamba
142
0
0
19 May 2025
Spectral-Spatial Self-Supervised Learning for Few-Shot Hyperspectral Image Classification
Wenchen Chen
Yanmei Zhang
Zhongwei Xiao
Jianping Chu
Xingbo Wang
100
0
0
18 May 2025
PRETI: Patient-Aware Retinal Foundation Model via Metadata-Guided Representation Learning
Yeonkyung Lee
Woojung Han
Youngjun Jun
Hyeonmin Kim
Jungkyung Cho
Seong Jae Hwang
MedIm
70
0
0
18 May 2025
Exploring the Potential of SSL Models for Sound Event Detection
Hanfang Cui
Longfei Song
Li Li
Dongxing Xu
Yanhua Long
93
0
0
17 May 2025
Physics-informed Temporal Alignment for Auto-regressive PDE Foundation Models
Congcong Zhu
Xiaoyan Xu
Jiayue Han
Jingrun Chen
OOD
AI4CE
149
0
0
16 May 2025
Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance
Lianhao Yin
O. Meireles
Guy Rosman
Daniela Rus
19
0
0
16 May 2025
CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
Mingyu Lu
Ethan Weinberger
Chanwoo Kim
Su-In Lee
33
0
0
16 May 2025
GeoMM: On Geodesic Perspective for Multi-modal Learning
Shibin Mei
Hang Wang
Bingbing Ni
74
0
0
16 May 2025
Nearest Neighbor Multivariate Time Series Forecasting
Huiliang Zhang
Ping Nie
Lijun Sun
Benoit Boulet
AI4TS
109
1
0
16 May 2025
Self-supervised perception for tactile skin covered dexterous hands
Akash Sharma
Carolina Higuera
Chaithanya Krishna Bodduluri
Ziqiang Liu
Taosha Fan
...
Byron Boots
Michael Kaess
Tingfan Wu
Francois Robert Hogan
Mustafa Mukadam
SSL
84
2
0
16 May 2025
DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning
Weilai Xiang
Hongyu Yang
Di Huang
Yunhong Wang
120
0
0
16 May 2025
GAIA: A Foundation Model for Operational Atmospheric Dynamics
Ata Akbari Asanjan
Olivia Alexander
Tom Berg
Clara Zhang
Matt Yang
...
Stephen Peng
Arun Ravindran
Olivier Raiman
David Potere
David Bell
29
0
0
15 May 2025
A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability
Jie Zhu
Jirong Zha
Ding Li
Leye Wang
131
1
0
15 May 2025
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation
Zibin Dong
Fei Ni
Yifu Yuan
Yinchuan Li
Jianye Hao
118
0
0
15 May 2025
Recent Advances in Medical Imaging Segmentation: A Survey
Fares Bougourzi
Abdenour Hadid
OOD
98
1
0
14 May 2025
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning
Berkay Guler
Giovanni Geraci
Hamid Jafarkhani
73
1
0
14 May 2025
BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis
Jiarun Liu
Hong-Yu Zhou
Weijian Huang
Hao Yang
Dongning Song
Tao Tan
Yong Liang
Shanshan Wang
MedIm
81
0
0
14 May 2025
An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care
Z. Soh
Yang Bai
Kai Yu
Yang Zhou
Xiaofeng Lei
...
J. Jonas
T. Y. Wong
Rick Siow Mong Goh
Yong Liu
Ching-Yu Cheng
35
0
0
13 May 2025
Few-shot Novel Category Discovery
Chunming Li
Shidong Wang
Haofeng Zhang
62
0
0
13 May 2025
VIViT: Variable-Input Vision Transformer Framework for 3D MR Image Segmentation
Badhan Kumar Das
Ajay Singh
Gengyan Zhao
Han Liu
Thomas J. Re
Dorin Comaniciu
Eli Gibson
Andreas Maier
ViT
MedIm
67
0
0
13 May 2025
Reinforcement Learning meets Masked Video Modeling: Trajectory-Guided Adaptive Token Selection
Ayush K. Rai
Kyle Min
Tarun Krishna
Feiyan Hu
Alan F. Smeaton
Noel E. O'Connor
VGen
96
0
0
13 May 2025
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series
Xiaolei Qin
Di Wang
Jing Zhang
Fengxiang Wang
Xin Su
Bo Du
Liangpei Zhang
AI4TS
114
0
0
13 May 2025
DELPHYNE: A Pre-Trained Model for General and Financial Time Series
Xueying Ding
Aakriti Mittal
Achintya Gopal
AI4TS
19
0
0
12 May 2025
ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning
Hongyin Zhang
Zifeng Zhuang
Han Zhao
Pengxiang Ding
Hongchao Lu
Donglin Wang
OffRL
126
0
0
12 May 2025
Vision Foundation Model Embedding-Based Semantic Anomaly Detection
M. Ronecker
Matthew Foutter
Amine Elhafsi
Daniele Gammelli
Ihor Barakaiev
Marco Pavone
Daniel Watzenig
59
1
0
12 May 2025
Sleep Position Classification using Transfer Learning for Bed-based Pressure Sensors
Olivier Papillon
Rafik Goubran
James Green
Julien Larivière-Chartier
Caitlin Higginson
Frank Knoefel
Rébecca Robillard
46
0
0
12 May 2025
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence
Yu Qiao
Huy Q. Le
Avi Deb Raha
Phuong-Nam Tran
Apurba Adhikary
Mengchun Zhang
Loc X. Nguyen
Eui-nam Huh
Dusit Niyato
Choong Seon Hong
AI4CE
161
1
0
11 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
89
0
0
11 May 2025
SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images
Yicheng Song
Tiancheng Lin
Die Peng
Su Yang
Yi Xu
MedIm
78
0
0
10 May 2025
Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation
Wenwen Qiang
Jianqi Zhang
Jingyao Wang
Changwen Zheng
VLM
139
0
0
10 May 2025
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
439
1
0
09 May 2025
Adapting a Segmentation Foundation Model for Medical Image Classification
Pengfei Gu
Haoteng Tang
Islam A. Ebeid
Jose Angel Nuñez
Fabian Vazquez
Diego Adame
Marcus Zhan
Huimin Li
Bin Fu
Danny Chen
MedIm
VLM
74
0
0
09 May 2025
HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder
Wooyoung Jeong
Hyun Jae Park
Seonghun Jeong
Jong Wook Jang
Tae Hoon Lim
Dae Seoung Kim
54
0
0
09 May 2025
Hybrid Learning: A Novel Combination of Self-Supervised and Supervised Learning for MRI Reconstruction without High-Quality Training Reference
Haoyang Pei
Ding Xia
Xiang Xu
William Moore
Yao Wang
Hersh Chandarana
Li Feng
81
0
0
09 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
60
0
0
09 May 2025
OWT: A Foundational Organ-Wise Tokenization Framework for Medical Imaging
Sifan Song
Siyeop Yoon
Pengfei Jin
Sekeun Kim
Matthew Tivnan
...
Zhiliang Lyu
Dufan Wu
Ning Guo
Xiang Li
Quanzheng Li
OOD
ViT
97
0
0
08 May 2025
The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
Tom Sander
Moritz Tenthoff
Kay Wohlfarth
Christian Wöhler
110
0
0
08 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
66
0
0
08 May 2025
EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution
Haizhen Xie
Kunpeng Du
Qiangyu Yan
Sen Lu
Jianhong Han
Hanting Chen
Hailin Hu
Jie Hu
112
0
0
08 May 2025
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation
Abdulaziz Almuzairee
Rohan Patil
Dwait Bhatt
Henrik I. Christensen
81
0
0
07 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
94
0
0
07 May 2025
HMAE: Self-Supervised Few-Shot Learning for Quantum Spin Systems
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
63
0
0
06 May 2025
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
Hafez Ghaemi
Eilif Muller
Shahab Bakhtiari
160
0
0
06 May 2025
Phenotype-Guided Generative Model for High-Fidelity Cardiac MRI Synthesis: Advancing Pretraining and Clinical Applications
Zhiyu Li
Yujian Hu
Zhengyao Ding
Yiheng Mao
Haoyang Li
Fan Yi
Hongkun Zhang
Zhengxing Huang
MedIm
82
1
0
06 May 2025
Dual-Domain Masked Image Modeling: A Self-Supervised Pretraining Strategy Using Spatial and Frequency Domain Masking for Hyperspectral Data
Shaheer Mohamed
Tharindu Fernando
Sridha Sridharan
Peyman Moghadam
Clinton Fookes
74
0
0
06 May 2025
Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach
Pierre Adorni
M. Pham
Stéphane May
Sébastien Lefèvre
80
0
0
06 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
Dengyang Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
107
0
0
05 May 2025
Always Skip Attention
Yiping Ji
Hemanth Saratchandran
Peyman Moghaddam
Simon Lucey
453
3
0
04 May 2025