Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.06377
Cited By
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,611 papers shown
Title
A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability
Jie Zhu
Jirong Zha
Ding Li
Leye Wang
17
0
0
15 May 2025
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation
Zibin Dong
Fei Ni
Yifu Yuan
Yinchuan Li
Jianye Hao
15
0
0
15 May 2025
Recent Advances in Medical Imaging Segmentation: A Survey
Fares Bougourzi
Abdenour Hadid
OOD
36
0
0
14 May 2025
BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis
Jiarun Liu
Hong-Yu Zhou
Weijian Huang
Hao Yang
Dongning Song
Tao Tan
Yong Liang
Shanshan Wang
MedIm
18
0
0
14 May 2025
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning
Berkay Guler
Giovanni Geraci
Hamid Jafarkhani
24
0
0
14 May 2025
Few-shot Novel Category Discovery
Chunming Li
Shidong Wang
Haofeng Zhang
26
0
0
13 May 2025
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush Rai
Kyle Min
Tarun Krishna
Feiyan Hu
A. Smeaton
Noel E. O'Connor
VGen
19
0
0
13 May 2025
VIViT: Variable-Input Vision Transformer Framework for 3D MR Image Segmentation
B. K. Das
Ajay Singh
Gengyan Zhao
Han Liu
Thomas J. Re
D. Comaniciu
Eli Gibson
Andreas K. Maier
ViT
MedIm
21
0
0
13 May 2025
An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care
Z. Soh
Yang Bai
Kai Yu
Yang Zhou
Xiaofeng Lei
...
J. Jonas
T. Y. Wong
Rick Siow Mong Goh
Yong Liu
Ching-Yu Cheng
23
0
0
13 May 2025
TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series
Xiaolei Qin
Di Wang
J. Zhang
Fengxiang Wang
Xin Su
Bo Du
Liangpei Zhang
AI4TS
16
0
0
13 May 2025
Vision Foundation Model Embedding-Based Semantic Anomaly Detection
M. Ronecker
Matthew Foutter
Amine Elhafsi
Daniele Gammelli
Ihor Barakaiev
Marco Pavone
Daniel Watzenig
24
0
0
12 May 2025
Sleep Position Classification using Transfer Learning for Bed-based Pressure Sensors
Olivier Papillon
Rafik Goubran
James Green
Julien Larivière-Chartier
Caitlin Higginson
Frank Knoefel
Rébecca Robillard
19
0
0
12 May 2025
ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning
Hongyin Zhang
Zifeng Zhuang
H. Zhao
Pengxiang Ding
Hongchao Lu
Donglin Wang
OffRL
41
0
0
12 May 2025
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence
Yu Qiao
Huy Q. Le
Avi Deb Raha
Phuong-Nam Tran
Apurba Adhikary
Mengchun Zhang
Loc X. Nguyen
Eui-nam Huh
Dusit Niyato
C. Hong
AI4CE
28
0
0
11 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
31
0
0
11 May 2025
SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images
Yicheng Song
Tiancheng Lin
Die Peng
Su Yang
Yi Xu
MedIm
28
0
0
10 May 2025
Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation
Jingyao Wang
Jianqi Zhang
Wenwen Qiang
Changwen Zheng
VLM
32
0
0
10 May 2025
HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder
Wooyoung Jeong
Hyun Jae Park
Seonghun Jeong
Jong Wook Jang
Tae Hoon Lim
Dae Seoung Kim
29
0
0
09 May 2025
Adapting a Segmentation Foundation Model for Medical Image Classification
Pengfei Gu
Haoteng Tang
Islam A. Ebeid
Jose Angel Nuñez
Fabian Vazquez
Diego Adame
Marcus Zhan
Huimin Li
Bin Fu
D. Z. Chen
MedIm
VLM
36
0
0
09 May 2025
Hybrid Learning: A Novel Combination of Self-Supervised and Supervised Learning for MRI Reconstruction without High-Quality Training Reference
Haoyang Pei
Ding Xia
Xiang Xu
William Moore
Yao Wang
Hersh Chandarana
Li Feng
29
0
0
09 May 2025
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
90
1
0
09 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
23
0
0
09 May 2025
EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution
Haizhen Xie
Kunpeng Du
Qiangyu Yan
Sen Lu
Jianhong Han
Hanting Chen
Hailin Hu
Jie Hu
48
0
0
08 May 2025
The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
Tom Sander
Moritz Tenthoff
Kay Wohlfarth
Christian Wöhler
29
0
0
08 May 2025
OWT: A Foundational Organ-Wise Tokenization Framework for Medical Imaging
Sifan Song
Siyeop Yoon
Pengfei Jin
Sekeun Kim
Matthew Tivnan
...
Zhiliang Lyu
Dufan Wu
Ning Guo
Xiang Li
Quanzheng Li
OOD
ViT
61
0
0
08 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
29
0
0
08 May 2025
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation
Abdulaziz Almuzairee
Rohan Patil
Dwait Bhatt
Henrik I. Christensen
29
0
0
07 May 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency
Qingyuan Wang
Guoxin Wang
B. Cardiff
Deepu John
33
0
0
07 May 2025
HMAE: Self-Supervised Few-Shot Learning for Quantum Spin Systems
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
29
0
0
06 May 2025
Phenotype-Guided Generative Model for High-Fidelity Cardiac MRI Synthesis: Advancing Pretraining and Clinical Applications
Z. Li
Yujian Hu
Zhengyao Ding
Yiheng Mao
H. Li
Fan Yi
Hongkun Zhang
Zhengxing Huang
MedIm
33
1
0
06 May 2025
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
Hafez Ghaemi
Eilif Muller
Shahab Bakhtiari
44
0
0
06 May 2025
Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach
Pierre Adorni
M. Pham
Stéphane May
Sébastien Lefèvre
39
0
0
06 May 2025
Dual-Domain Masked Image Modeling: A Self-Supervised Pretraining Strategy Using Spatial and Frequency Domain Masking for Hyperspectral Data
Shaheer Mohamed
Tharindu Fernando
S. Sridharan
Peyman Moghadam
Clinton Fookes
31
0
0
06 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
46
0
0
05 May 2025
Always Skip Attention
Yiping Ji
Hemanth Saratchandran
Peyman Moghaddam
Simon Lucey
107
0
0
04 May 2025
Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study
Ali Mammadov
Loic Le Folgoc
Julien Adam
Anne Buronfosse
Gilles Hayem
Guillaume Hocquet
Pietro Gori
SSL
40
0
0
02 May 2025
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow
Changhe Chen
Quantao Yang
Xiaohao Xu
Nima Fazeli
Olov Andersson
24
0
0
02 May 2025
Contextures: Representations from Contexts
Runtian Zhai
Kai Yang
Che-Ping Tsai
Burak Varici
Zico Kolter
Pradeep Ravikumar
92
0
0
02 May 2025
A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation viaSynergistic Pseudo-Labeling and Generative Learning
Anan Yaghmour
Melba M. Crawford
Saurabh Prasad
24
0
0
02 May 2025
Enhancing User Sequence Modeling through Barlow Twins-based Self-Supervised Learning
Yuhan Liu
Lin Ning
Neo Wu
Karan Singhal
Philip Mansfield
D. Berlowitz
Sushant Prakash
Bradley Green
SSL
49
0
0
02 May 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma
Luoxin Ye
Nessa McWeeney
Celso M de Melo
A. Yuille
Jieneng Chen
LRM
57
1
0
01 May 2025
Multimodal Masked Autoencoder Pre-training for 3D MRI-Based Brain Tumor Analysis with Missing Modalities
Lucas Robinet
Ahmad Berjaoui
Elizabeth Cohen-Jonathan Moyal
23
0
0
01 May 2025
Protocol-agnostic and Data-free Backdoor Attacks on Pre-trained Models in RF Fingerprinting
Tianya Zhao
Ningning Wang
Junqing Zhang
Xuyu Wang
AAML
43
0
0
01 May 2025
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
X. Li
Guangliang Cheng
Mamba
72
0
0
01 May 2025
AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care
Md Asaduzzaman Jabin
Hanqi Jiang
Y. Li
Patrick Kaggwa
Eugene Douglass
Juliet N. Sekandi
Tianming Liu
LM&MA
71
0
0
01 May 2025
Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning
J. Wang
Tianci Luo
Yaohua Zha
Yan Feng
Ruisheng Luo
B. Chen
Tao Dai
Long Chen
Yaowei Wang
Shu-Tao Xia
VLM
50
0
0
30 Apr 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Haifeng Huang
Xinyi Chen
Y. Chen
H. Li
Xiaoshen Han
Z. Wang
Tai Wang
Jiangmiao Pang
Zhou Zhao
LM&Ro
75
0
0
30 Apr 2025
Wireless Communication as an Information Sensor for Multi-agent Cooperative Perception: A Survey
Zhiying Song
Tenghui Xie
Fuxi Wen
Jun Li
38
0
0
30 Apr 2025
Insulin Resistance Prediction From Wearables and Routine Blood Biomarkers
Ahmed A. Metwally
A. Heydari
Daniel J. McDuff
Alexandru Solot
Zeinab Esmaeilpour
...
David B. Savage
C. Heneghan
Shwetak N. Patel
Cathy Speed
Javier L. Prieto
24
1
0
30 Apr 2025
GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation
Jingfeng Guo
J. Chen
Weikai Chen
Zhenyu Sun
Lanjiong Li
Baozhu Zhao
Lingting Zhu
X. Wang
Qi Liu
3DH
80
0
0
29 Apr 2025
1
2
3
4
...
91
92
93
Next