2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,778 papers shown
Title
Efficient Pretraining Model based on Multi-Scale Local Visual Field Feature Reconstruction for PCB CT Image Element Segmentation
Chen Chen
Kai Qiao
Jie Yang
Jian Chen
Bin Yan
48
2
0
09 May 2024
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
Lingdong Kong
You-Chen Liu
Lai Xing Ng
Benoit R. Cottereau
Wei Tsang Ooi
VLM
87
17
0
08 May 2024
EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning
Jingfeng Yao
Xinggang Wang
Yuehao Song
Huangxuan Zhao
Jun Ma
Yajie Chen
Wenyu Liu
Bo Wang
ViT
82
6
0
08 May 2024
StyleMamba : State Space Model for Efficient Text-driven Image Style Transfer
Zijia Wang
Zhi-Song Liu
Mamba
80
8
0
08 May 2024
TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking
Pengcheng Shao
Tianyang Xu
Zhangyong Tang
Linze Li
Xiao-Jun Wu
Josef Kittler
101
6
0
08 May 2024
BenthicNet: A global compilation of seafloor images for deep learning applications
Joakim Bruslund Haurum
B. Misiuk
Isaac Xu
Shakhboz Abdulazizov
A. R. Baroi
...
Jordan A. Thomson
Brittany R. Wilson
Melisa C. Wong
Craig J. Brown
Thomas Trappenberg
106
4
0
08 May 2024
UniGen: Unified Modeling of Initial Agent States and Trajectories for Generating Autonomous Driving Scenarios
R. Mahjourian
Rongbing Mu
Valerii Likhosherstov
Paul Mougin
Xiukun Huang
Joao Messias
Shimon Whiteson
61
8
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
176
48
0
06 May 2024
Visual Language Model based Cross-modal Semantic Communication Systems
Feibo Jiang
Chuanguo Tang
Li Dong
Kezhi Wang
Kun Yang
Cunhua Pan
VLM
83
4
0
06 May 2024
Class-relevant Patch Embedding Selection for Few-Shot Image Classification
Weihao Jiang
Haoyang Cui
Kun He
VLM
81
0
0
06 May 2024
Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning
Weihao Jiang
Chang-Shu Liu
Kun He
ViT
110
0
0
06 May 2024
Transformer-based RGB-T Tracking with Channel and Spatial Feature Fusion
Yunfeng Li
Bo Wang
Ye Li
ViT
117
7
0
06 May 2024
You Only Need Half: Boosting Data Augmentation by Using Partial Content
Juntao Hu
Yuan Wu
73
1
0
05 May 2024
Region-specific Risk Quantification for Interpretable Prognosis of COVID-19
Zhusi Zhong
Jie Li
Zhuoqi Ma
Scott Collins
Harrison X. Bai
Paul J Zhang
Terrance Healey
Xinbo Gao
Michael Atalay
Zhicheng Jiao
42
0
0
05 May 2024
MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
Vishal Nedungadi
A. Kariryaa
Stefan Oehmcke
Serge Belongie
Christian Igel
Nico Lang
110
28
0
04 May 2024
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook
Yanan Zhang
Jinqing Zhang
Zengran Wang
Junhao Xu
Di Huang
77
18
0
04 May 2024
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model
Weiqi Zhang
Jiexia Ye
Ke Yi
Yongzi Yu
Ziyue Li
Jia Li
Fugee Tsung
AI4TS
AI4CE
96
29
0
03 May 2024
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Jian Meng
Yuan Liao
Anupreetham Anupreetham
Ahmed Hassan
Shixing Yu
Han-Sok Suh
Xiaofeng Hu
Jae-sun Seo
MQ
91
2
0
02 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
95
11
0
02 May 2024
Adapting Self-Supervised Learning for Computational Pathology
Eric Zimmermann
Neil Tenenholtz
James Hall
George Shaikovski
Michal Zelechowski
...
Fausto Milletari
Julian Viret
Eugene Vorontsov
Siqi Liu
Kristen Severson
OOD
75
1
0
02 May 2024
Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers
Saahil Islam
Venkatesh N. Murthy
Dominik Neumann
Badhan Kumar Das
Puneet Sharma
Andreas Maier
Dorin Comaniciu
Florin-Cristian Ghesu
103
1
0
02 May 2024
Spider: A Unified Framework for Context-dependent Concept Segmentation
Xiaoqi Zhao
Youwei Pang
Wei Ji
Baicheng Sheng
Jiaming Zuo
Lihe Zhang
Huchuan Lu
98
8
0
02 May 2024
SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters
Shengsheng Lin
Weiwei Lin
Wentai Wu
Haojun Chen
Junjie Yang
109
54
0
02 May 2024
CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation
Chenying Liu
C. Albrecht
Yi Wang
Xiao Xiang Zhu
247
3
0
02 May 2024
GraCo: Granularity-Controllable Interactive Segmentation
Yian Zhao
Kehan Li
Ze-Long Cheng
Pengchong Qiao
Xiawu Zheng
Rongrong Ji
Chang Liu
Li-ming Yuan
Jie Chen
112
9
0
01 May 2024
Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable
Haozhe Liu
Wentian Zhang
Bing Li
Bernard Ghanem
Jürgen Schmidhuber
DiffM
WIGM
AAML
83
1
0
01 May 2024
Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis
H. Nguyen
Junichi Yamagishi
Isao Echizen
123
9
0
01 May 2024
Training a high-performance retinal foundation model with half-the-data and 400 times less compute
Justin Engelmann
Miguel O. Bernabeu
MedIm
OOD
123
1
0
30 Apr 2024
Masked Multi-Query Slot Attention for Unsupervised Object Discovery
Rishav Pramanik
José-Fabian Villa-Vásquez
M. Pedersoli
OCL
120
0
0
30 Apr 2024
Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model
Denys Godwin
Hanxi Li
Michael Cecil
Hamed Alemohammad
61
2
0
30 Apr 2024
On Improving the Algorithm-, Model-, and Data- Efficiency of Self-Supervised Learning
Yunhao Cao
Jianxin Wu
70
0
0
30 Apr 2024
Integrating Present and Past in Unsupervised Continual Learning
Yipeng Zhang
Laurent Charlin
R. Zemel
Mengye Ren
CLL
79
4
0
29 Apr 2024
MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection
H. R. Medeiros
David Latortue
Fidel Alejandro Guerrero Peña
Eric Granger
M. Pedersoli
58
0
0
29 Apr 2024
Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
OffRL
109
0
0
29 Apr 2024
MultiMAE-DER: Multimodal Masked Autoencoder for Dynamic Emotion Recognition
Peihao Xiang
Chaohao Lin
Kaida Wu
Ou Bai
95
3
0
28 Apr 2024
Position: Do Not Explain Vision Models Without Context
Paulina Tomaszewska
Przemysław Biecek
68
1
0
28 Apr 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Yong A
Hongze Yu
...
Huaping Liu
Gang Hua
F. Sun
Jianwei Zhang
Bin Fang
AI4CE
LM&Ro
222
15
0
28 Apr 2024
Pre-training on High Definition X-ray Images: An Experimental Study
Tianlin Li
Yuehang Li
Wentao Wu
Jiandong Jin
Yao Rong
Bowei Jiang
Chuanfu Li
Jin Tang
MedIm
ViT
LM&MA
129
3
0
27 Apr 2024
Masked Two-channel Decoupling Framework for Incomplete Multi-view Weak Multi-label Learning
Chengliang Liu
Jie Wen
Yabo Liu
Chao Huang
Zhihao Wu
Xiaoling Luo
Yong-mei Xu
83
8
0
26 Apr 2024
SAGHOG: Self-Supervised Autoencoder for Generating HOG Features for Writer Retrieval
Marco Peer
Florian Kleber
Robert Sablatnig
97
1
0
26 Apr 2024
Self-supervised visual learning in the low-data regime: a comparative evaluation
Sotirios Konstantakos
Despina Ioanna Chalkiadaki
Ioannis Mademlis
Yuki M. Asano
E. Gavves
Georgios Th. Papadopoulos
127
6
0
26 Apr 2024
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Charig Yang
Weidi Xie
Andrew Zisserman
82
2
0
25 Apr 2024
Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals
Oliver Hahn
Nikita Araslanov
Simone Schaub-Meyer
Stefan Roth
3DPC
79
4
0
25 Apr 2024
Road Surface Friction Estimation for Winter Conditions Utilising General Visual Features
Risto Ojala
Eerik Alamikkotervo
33
1
0
25 Apr 2024
Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud
Ayumu Saito
Prachi Kudeshia
Jiju Poovvancheri
3DPC
152
9
0
25 Apr 2024
Robust Fine-tuning for Pre-trained 3D Point Cloud Models
Zhibo Zhang
Ximing Yang
Weizhong Zhang
Cheng Jin
3DPC
107
1
0
25 Apr 2024
Exploring Learngene via Stage-wise Weight Sharing for Initializing Variable-sized Models
Shiyu Xia
Wenxuan Zhu
Xu Yang
Xin Geng
61
2
0
25 Apr 2024
Editable Image Elements for Controllable Synthesis
Jiteng Mu
Michael Gharbi
Richard Zhang
Eli Shechtman
Nuno Vasconcelos
Xiaolong Wang
Taesung Park
DiffM
92
9
0
24 Apr 2024
Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
Eunsu Baek
Keondo Park
Jiyoon Kim
Hyung-Sin Kim
OODD
OOD
122
6
0
24 Apr 2024
Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders
Chuang Liu
Yuyao Wang
Yibing Zhan
Xueqi Ma
Dapeng Tao
Hongzhi Zhang
Wenbin Hu
99
5
0
24 Apr 2024