2111.06377
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Efficient RL via Disentangled Environment and Agent Representations
Kevin Gmelin
Shikhar Bahl
Russell Mendonca
Deepak Pathak
DRL
73
9
0
05 Sep 2023
SeisCLIP: A seismology foundation model pre-trained by multi-modal data for multi-purpose seismic feature extraction
Xu Si
Xinming Wu
Hanlin Sheng
Jun Zhu
Zefeng Li
66
14
0
05 Sep 2023
Hierarchical Masked 3D Diffusion Model for Video Outpainting
Fanda Fan
Chaoxu Guo
Litong Gong
Biao Wang
T. Ge
Yuning Jiang
Chunjie Luo
Jianfeng Zhan
DiffM
VGen
85
15
0
05 Sep 2023
Probabilistic Self-supervised Learning via Scoring Rules Minimization
Amirhossein Vahidi
Simon Schoßer
Lisa Wimmer
Yawei Li
B. Bischl
Eyke Hüllermeier
Mina Rezaei
SSL
73
2
0
05 Sep 2023
Empowering Low-Light Image Enhancer through Customized Learnable Priors
Naishan Zheng
Man Zhou
Yanmeng Dong
Xiangyu Rui
Jie Huang
Chongyi Li
Fengmei Zhao
115
30
0
05 Sep 2023
Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery
J. Park
Daniel Sungho Jung
Gyeongsik Moon
Kyoung Mu Lee
85
6
0
05 Sep 2023
Corgi^2: A Hybrid Offline-Online Approach To Storage-Aware Data Shuffling For SGD
Etay Livne
Gal Kaplun
Eran Malach
Shai Shalev-Shwartz
OffRL
100
0
0
04 Sep 2023
Locality-Aware Hyperspectral Classification
Fangqin Zhou
Mert Kilickaya
Joaquin Vanschoren
ViT
31
6
0
04 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
93
27
0
04 Sep 2023
Leveraging Self-Supervised Vision Transformers for Segmentation-based Transfer Function Design
Dominik Engel
Leon Sick
Timo Ropinski
ViT
84
0
0
04 Sep 2023
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers
J. Denize
Mykola Liashuha
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
ViT
66
13
0
03 Sep 2023
Large AI Model Empowered Multimodal Semantic Communications
Feibo Jiang
Yubo Peng
Li Dong
Kezhi Wang
Kun Yang
Cunhua Pan
Xiaohu You
101
47
0
03 Sep 2023
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
Cheng Shi
Sibei Yang
VLM
ObjD
89
39
0
03 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
123
8
0
02 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
125
27
0
02 Sep 2023
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
Ziyu Guo
Renrui Zhang
Xiangyang Zhu
Yiwen Tang
Xianzheng Ma
...
Ke Chen
Peng Gao
Xianzhi Li
Hongsheng Li
Pheng-Ann Heng
MLLM
110
146
0
01 Sep 2023
Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction
Peizhen Bai
Xianyuan Liu
Haiping Lu
ViT
AI4CE
73
2
0
01 Sep 2023
A Locality-based Neural Solver for Optical Motion Capture
Xiaoyu Pan
Bowen Zheng
Xinwei Jiang
Guanglong Xu
Xianli Gu
...
Qilong Kou
He Wang
Tianjia Shao
Kun Zhou
Xiaogang Jin
48
5
0
01 Sep 2023
RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion
Zhiqiang Yan
Xiang Li
Le Hui
Zhenyu Zhang
Jun Yu Li
Jian Yang
VLM
3DV
113
5
0
01 Sep 2023
RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability
Chuning Zhu
Max Simchowitz
Siri Gadipudi
Abhishek Gupta
107
14
0
31 Aug 2023
TouchStone: Evaluating Vision-Language Models by Language Models
Shuai Bai
Shusheng Yang
Jinze Bai
Peng Wang
Xing Zhang
Junyang Lin
Xinggang Wang
Chang Zhou
Jingren Zhou
MLLM
119
48
0
31 Aug 2023
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
Jiaben Chen
Huaizu Jiang
3DH
79
7
0
31 Aug 2023
Masked Transformer for Electrocardiogram Classification
Ya Zhou
Xiaolin Diao
Yanni Huo
Yang Liu
Xiaohan Fan
Wei Zhao
MedIm
76
2
0
31 Aug 2023
CL-MAE: Curriculum-Learned Masked Autoencoders
Neelu Madan
Nicolae-Cătălin Ristea
Kamal Nasrollahi
T. Moeslund
Radu Tudor Ionescu
106
12
0
31 Aug 2023
Self-Sampling Meta SAM: Enhancing Few-shot Medical Image Segmentation with Meta-Learning
Yiming Zhang
Tianang Leng
Kun Han
Xiaohui Xie
101
19
0
31 Aug 2023
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu
Tianzhe Chu
Shengbang Tong
Ziyang Wu
Druv Pai
Sam Buchanan
Yi Ma
ViT
50
22
0
30 Aug 2023
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion
Man Zhou
Jie Huang
Naishan Zheng
Chongyi Li
37
7
0
30 Aug 2023
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu
Mengdan Zhang
Xiaoshan Yang
Changsheng Xu
ObjD
84
5
0
30 Aug 2023
Towards a Rigorous Analysis of Mutual Information in Contrastive Learning
Kyungeun Lee
Jaeill Kim
Suhyun Kang
Wonjong Rhee
SSL
82
2
0
30 Aug 2023
Prototype Fission: Closing Set for Robust Open-set Semi-supervised Learning
Xuwei Tan
Yi-Jie Huang
Yaqian Li
131
2
0
29 Aug 2023
A General-Purpose Self-Supervised Model for Computational Pathology
Richard J. Chen
Tong Ding
Ming Y. Lu
Drew F. K. Williamson
Guillaume Jaume
...
Judy J. Wang
Walt Williams
L. Le
Georg Gerber
Faisal Mahmood
MedIm
140
44
0
29 Aug 2023
Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation
Fu-En Yang
Chien-Yi Wang
Yu-Chiang Frank Wang
VLM
FedML
112
69
0
29 Aug 2023
Enhancing Robot Learning through Learned Human-Attention Feature Maps
D. Scheuchenstuhl
Stefan Ulmer
Felix Resch
Luigi Berducci
Radu Grosu
69
0
0
29 Aug 2023
Cross-Modal Retrieval Meets Inference: Improving Zero-Shot Classification with Cross-Modal Retrieval
Seong-Hoon Eom
Namgyu Ho
Jaehoon Oh
Se-Young Yun
CLIP
VLM
75
0
0
29 Aug 2023
PronounFlow: A Hybrid Approach for Calibrating Pronouns in Sentences
Nicos Isaak
66
1
0
29 Aug 2023
When hard negative sampling meets supervised contrastive learning
Zijun Long
George Killick
R. McCreadie
Gerardo Aragon Camarasa
Zaiqiao Meng
SSL
59
3
0
28 Aug 2023
Efficient Discovery and Effective Evaluation of Visual Perceptual Similarity: A Benchmark and Beyond
Oren Barkan
Tal Reiss
Jonathan Weill
Ori Katz
Roy Hirsch
Itzik Malkiel
Noam Koenigstein
83
6
0
28 Aug 2023
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation
Xudong Wang
Ishan Misra
Ziyun Zeng
Rohit Girdhar
Trevor Darrell
93
18
0
28 Aug 2023
Diversified Ensemble of Independent Sub-Networks for Robust Self-Supervised Representation Learning
Amirhossein Vahidi
Lisa Wimmer
H. Gündüz
Bernd Bischl
Eyke Hüllermeier
Mina Rezaei
OOD
UQCV
100
4
0
28 Aug 2023
Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities
Leman Akoglu
Jaemin Yoo
63
1
0
28 Aug 2023
A Unified Transformer-based Network for multimodal Emotion Recognition
Kamran Ali
Charles E. Hughes
85
1
0
27 Aug 2023
Nonrigid Object Contact Estimation With Regional Unwrapping Transformer
Wei Xie
Zimeng Zhao
Shiying Li
Binghui Zuo
Yangang Wang
67
4
0
27 Aug 2023
Forensic Histopathological Recognition via a Context-Aware MIL Network Powered by Self-Supervised Contrastive Learning
Chen Shen
Jun Zhang
Xinggong Liang
Zeyi Hao
Ke Li
Fan Wang
Zhenyuan Wang
C. Lian
45
2
0
27 Aug 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
141
21
0
27 Aug 2023
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning
P. Balaji
Abhijit Das
Srijan Das
A. Dantcheva
CVBM
61
4
0
25 Aug 2023
Fine-tuning can cripple your foundation model; preserving features may be the solution
Jishnu Mukhoti
Y. Gal
Philip Torr
P. Dokania
CLL
135
47
0
25 Aug 2023
AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning
C. Lessig
Ilaria Luise
Bing Gong
M. Langguth
S. Stadtler
Martin G. Schultz
54
29
0
25 Aug 2023
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
Yuanyou Xu
Zongxin Yang
Yi Yang
VOS
112
9
0
25 Aug 2023
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling
Shimin Zhang
Qu Yang
Chenxiang Ma
Jibin Wu
Haizhou Li
Kay Chen Tan
83
20
0
25 Aug 2023
Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation
Jiaming Zhang
Yutao Cui
Gangshan Wu
Limin Wang
VOS
130
10
0
25 Aug 2023