2111.06377
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,778 papers shown
Title
Scaling-laws for Large Time-series Models
Thomas D. P. Edwards
James Alvey
Justin Alsing
Nam H. Nguyen
Benjamin Dan Wandelt
AI4TS
AIFin
80
7
0
22 May 2024
Context and Geometry Aware Voxel Transformer for Semantic Scene Completion
Zhuopu Yu
Runmin Zhang
Jiacheng Ying
Junchen Yu
Xiaohai Hu
Lun Luo
Siyuan Cao
Hui-Liang Shen
ViT
100
15
0
22 May 2024
MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation
Zhiping Yu
Chenyang Liu
Liqin Liu
Z. Shi
Zhengxia Zou
VGen
71
16
0
22 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
88
6
0
22 May 2024
NERULA: A Dual-Pathway Self-Supervised Learning Framework for Electrocardiogram Signal Analysis
G. Manimaran
S. Puthusserypady
Helena Domínguez
A. Atienza
J. Bardram
63
1
0
21 May 2024
BIMM: Brain Inspired Masked Modeling for Video Representation Learning
Zhifan Wan
Jie Zhang
Chang-bo Li
Shiguang Shan
95
0
0
21 May 2024
A Masked Semi-Supervised Learning Approach for Otago Micro Labels Recognition
Meng Shang
L. Dedeyne
J. Dupont
Laura Vercauteren
Nadjia Amini
...
E. Gielen
Sabine Verschueren
Carolina Varon
W. de Raedt
Bart Vanrumste
57
0
0
21 May 2024
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Zhenwei Shao
Zhou Yu
Jun Yu
Xuecheng Ouyang
Lihao Zheng
Zhenbiao Gai
Mingyang Wang
Jiajun Ding
67
11
0
20 May 2024
Rethinking Overlooked Aspects in Vision-Language Models
Yuan Liu
Le Tian
Xiao Zhou
Jie Zhou
VLM
87
2
0
20 May 2024
GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
Milad Cheraghalikhani
G. A. V. Hakim
David Osowiechi
Farzad Beizaee
Ismail Ben Ayed
Christian Desrosiers
3DPC
148
2
0
20 May 2024
Transcriptomics-guided Slide Representation Learning in Computational Pathology
Guillaume Jaume
Lukas Oldenburg
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Thomas Peeters
Andrew H. Song
Faisal Mahmood
112
30
0
19 May 2024
NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation
Zhiyu Xu
Qingliang Chen
72
0
0
19 May 2024
Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
Hui Zheng
Haiteng Wang
Wei-Bang Jiang
Zhongtao Chen
Li He
Pei-Yang Lin
Peng-Hu Wei
Guo-Guang Zhao
Yun-Zhe Liu
74
2
0
19 May 2024
NetMamba: Efficient Network Traffic Classification via Pre-training Unidirectional Mamba
Tongze Wang
Xiaohui Xie
Wenduo Wang
Chuyi Wang
Youjian Zhao
Yong Cui
Mamba
73
17
0
19 May 2024
DINO as a von Mises-Fisher mixture model
Hariprasath Govindarajan
Per Sidén
Jacob Roll
Fredrik Lindsten
94
12
0
17 May 2024
Blackbox Adaptation for Medical Image Segmentation
Jay N. Paranjape
S. Sikder
S. Vedula
Vishal M. Patel
VLM
MedIm
74
1
0
17 May 2024
A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model
Mingxiang Fu
Yu Song
Jiameng Lv
Liang Cao
Peng Jia
...
Lili Wang
Shoulin Wei
Haifeng Yang
Zhenping Yi
Zhiqiang Zou
36
3
0
17 May 2024
Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Shengyuan Yang
Jiawang Bai
Kuofeng Gao
Yong-Liang Yang
Yiming Li
Shu-Tao Xia
AAML
SILM
109
5
0
17 May 2024
A Novel Bounding Box Regression Method for Single Object Tracking
Omar Abdelaziz
Mohamed Shehata
140
1
0
16 May 2024
Beyond Traditional Single Object Tracking: A Survey
Omar Abdelaziz
Mohamed Shehata
Mohamed Mohamed
123
1
0
16 May 2024
Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu
Xiaoshan Yang
Y. Song
Changsheng Xu
MLLM
VLM
94
8
0
16 May 2024
Networking Systems for Video Anomaly Detection: A Tutorial and Survey
Jing Liu
Yang Liu
Jieyu Lin
Jielin Li
Peng Sun
Bo Hu
Liang Song
Azzedine Boukerche
Victor C.M. Leung
210
13
0
16 May 2024
Cross-sensor self-supervised training and alignment for remote sensing
V. Marsocci
Nicolas Audebert
86
1
0
16 May 2024
MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding
Jiajie Teng
Huiyu Duan
Yucheng Zhu
Sijing Wu
Guangtao Zhai
69
2
0
15 May 2024
Task-adaptive Q-Face
Haomiao Sun
Mingjie He
Shiguang Shan
Hu Han
Xilin Chen
CVBM
95
4
0
15 May 2024
Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis
A. Englebert
Anne-Sophie Collin
O. Cornu
Christophe De Vleeschouwer
76
1
0
14 May 2024
CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Oncel Tuzel
VLM
CLIP
108
6
0
14 May 2024
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei
Zixuan Pan
Andrew Owens
VLM
93
10
0
14 May 2024
EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training
Yulin Wang
Yang Yue
Rui Lu
Yizeng Han
Shiji Song
Gao Huang
VLM
114
12
0
14 May 2024
Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning
Alain Riou
Stefan Lattner
Gaëtan Hadjeres
Geoffroy Peeters
69
2
0
14 May 2024
Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research
Qinglong Cao
Yuntian Chen
Lu Lu
Hao Sun
Zhenzhong Zeng
Xiaokang Yang
Dong-juan Zhang
VLM
63
1
0
14 May 2024
Self-supervised learning improves robustness of deep learning lung tumor segmentation to CT imaging differences
Jue Jiang
Aneesh Rangnekar
Harini Veeraraghavan
OOD
77
0
0
14 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
77
8
0
13 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
99
59
0
13 May 2024
The Platonic Representation Hypothesis
Minyoung Huh
Brian Cheung
Tongzhou Wang
Phillip Isola
142
142
0
13 May 2024
SignAvatar: Sign Language 3D Motion Reconstruction and Generation
Lu Dong
Lipisha Chaudhary
Fei Xu
Xiao Wang
Mason Lary
Ifeoma Nwogu
SLR
66
4
0
13 May 2024
PLUTO: Pathology-Universal Transformer
Dinkar Juyal
Harshith Padigela
Chintan Shah
Daniel Shenker
Natalia Harguindeguy
...
E. Walk
J. Abel
Harsha Pokkalla
A. Beck
S. Grullon
MedIm
ViT
LM&MA
80
13
0
13 May 2024
NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images
Matthew Keller
Chi-en Amy Tai
Yuhao Chen
Pengcheng Xi
Alexander Wong
ViT
48
5
0
13 May 2024
FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival
Liangrui Pan
Yijun Peng
Yan Li
Yiyi Liang
Liwen Xu
Qingchun Liang
Shaoliang Peng
82
0
0
13 May 2024
MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders
Xue-Qiu Jiang
Sheng Jin
Xiaoqin Zhang
Ling Shao
Shijian Lu
MDE
82
7
0
13 May 2024
CrossCert: A Cross-Checking Detection Approach to Patch Robustness Certification for Deep Learning Models
Qili Zhou
Zhengyuan Wei
Haipeng Wang
Bo Jiang
William Chan
AAML
117
1
0
13 May 2024
MaskFuser: Masked Fusion of Joint Multi-Modal Tokenization for End-to-End Autonomous Driving
Yiqun Duan
Xianda Guo
Zheng Zhu
Zhen Wang
Yu-Kai Wang
Chin-Teng Lin
81
2
0
13 May 2024
Unified Video-Language Pre-training with Synchronized Audio
Shentong Mo
Haofan Wang
Huaxia Li
Xu Tang
77
2
0
12 May 2024
Replication Study and Benchmarking of Real-Time Object Detection Models
Pierre-Luc Asselin
Vincent Coulombe
William Guimont-Martin
William Larrivée-Hardy
88
0
0
11 May 2024
Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare
Xingyu Li
Lu Peng
Yuping Wang
Weihua Zhang
AI4CE
MedIm
LM&MA
114
12
0
10 May 2024
Federated Document Visual Question Answering: A Pilot Study
Khanh Nguyen
Dimosthenis Karatzas
FedML
84
0
0
10 May 2024
Learning Latent Dynamic Robust Representations for World Models
Ruixiang Sun
Hongyu Zang
Xin-hui Li
Riashat Islam
75
5
0
10 May 2024
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process
Tong Xiao
Jia-Yin Liu
Zhenya Huang
Jinze Wu
Jing Sha
Shijin Wang
Enhong Chen
AI4CE
80
4
0
10 May 2024
MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning
Wenjin Zhang
Keyi Li
Sen Yang
Chenyang Gao
Wanzhao Yang
Sifan Yuan
I. Marsic
66
1
0
10 May 2024
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition
Zuan Gao
Yuxin Wang
Yadong Qu
Boqiang Zhang
Zixiao Wang
Jianjun Xu
Hongtao Xie
ViT
74
9
0
09 May 2024