2111.06377
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding
Li Jiang
Zetong Yang
Shaoshuai Shi
Vladislav Golyanik
Dengxin Dai
Bernt Schiele
3DPC
114
13
0
08 May 2023
SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding
Hezhen Hu
Weichao Zhao
Wen-gang Zhou
Houqiang Li
ViT
95
74
0
08 May 2023
Graph Masked Autoencoder for Sequential Recommendation
Yaowen Ye
Lianghao Xia
Chao Huang
106
45
0
08 May 2023
CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation
J. Heo
S. Azizi
A. Fayyazi
Massoud Pedram
57
0
0
08 May 2023
Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
Zhicheng Wang
Liwen Xiao
Zhiguo Cao
Hao Lu
94
15
0
08 May 2023
AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation
Jiacheng Lin
Jiajun Chen
Kailun Yang
Alina Roitberg
Siyu Li
Zhiyong Li
Shutao Li
78
18
0
07 May 2023
Robust Image Ordinal Regression with Controllable Image Generation
Yi Cheng
Haochao Ying
Renjun Hu
Jinhong Wang
Wenhao Zheng
X. Zhang
Benlin Liu
Jian Wu
68
6
0
07 May 2023
PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos
Zhiqiang Shen
Xiaoxiao Sheng
Longguang Wang
Y. Guo
Qiong Liu
Xiaoping Zhou
3DPC
SSL
79
15
0
06 May 2023
Annotation-efficient learning for OCT segmentation
Haoran Zhang
Jianlong Yang
Ce Zheng
Shiqing Zhao
Aili Zhang
MedIm
286
8
0
06 May 2023
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey
Yichi Zhang
Rushi Jiao
MedIm
VLM
113
27
0
05 May 2023
Learn how to Prune Pixels for Multi-view Neural Image-based Synthesis
Marta Milovanović
Enzo Tartaglione
Marco Cagnazzo
F. Henry
66
0
0
05 May 2023
Random Smoothing Regularization in Kernel Gradient Descent Learning
Liang Ding
Tianyang Hu
Jiahan Jiang
Donghao Li
Wei Cao
Yuan Yao
74
6
0
05 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
179
6
0
05 May 2023
CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning
Sai Qian Zhang
Thierry Tambe
Nestor Cuevas
Gu-Yeon Wei
David Brooks
56
4
0
04 May 2023
Masked Trajectory Models for Prediction, Representation, and Control
Philipp Wu
Arjun Majumdar
Kevin Stone
Yixin Lin
Igor Mordatch
Pieter Abbeel
Aravind Rajeswaran
OffRL
67
39
0
04 May 2023
Multi-grained Hypergraph Interest Modeling for Conversational Recommendation
Chenzhang Shang
Yupeng Hou
Wayne Xin Zhao
Yaliang Li
Jing Zhang
119
12
0
04 May 2023
Revisiting the Encoding of Satellite Image Time Series
Xin Cai
Y. Bi
Peter Nicholl
Roy Sterritt
AI4TS
83
5
0
03 May 2023
SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model
Di Wang
Jing Zhang
Bo Du
Minqiang Xu
Lin Liu
Dacheng Tao
Lefei Zhang
205
71
0
03 May 2023
Representation Learning via Manifold Flattening and Reconstruction
Michael Psenka
Druv Pai
Vishal Raman
S. Sastry
Yi Ma
75
10
0
02 May 2023
UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models
Deming Ye
Yankai Lin
Zhengyan Zhang
Maosong Sun
KELM
58
0
0
02 May 2023
Exploring vision transformer layer choosing for semantic segmentation
Fangjian Lin
Yizhe Ma
Sheng Tian
ViT
64
4
0
02 May 2023
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Subhajit Maity
Sanket Biswas
Siladittya Manna
Ayan Banerjee
Josep Lladós
Saumik Bhattacharya
Umapada Pal
85
5
0
01 May 2023
What Do Self-Supervised Vision Transformers Learn?
Namuk Park
Wonjae Kim
Byeongho Heo
Taekyung Kim
Sangdoo Yun
SSL
176
81
1
01 May 2023
Part Aware Contrastive Learning for Self-Supervised Action Recognition
Yilei Hua
Wenhan Wu
Ce Zheng
Aidong Lu
Mengyuan Liu
Chong Chen
Shiqian Wu
SSL
166
39
0
01 May 2023
SLSG: Industrial Image Anomaly Detection by Learning Better Feature Embeddings and One-Class Classification
Minghui Yang
Jing Liu
Zhiwei Yang
Zhaoyang Wu
73
10
0
30 Apr 2023
Modality-invariant Visual Odometry for Embodied Vision
Marius Memmel
Roman Bachmann
Amir Zamir
127
9
0
29 Apr 2023
Prompt Engineering for Healthcare: Methodologies and Applications
Jiaqi Wang
Enze Shi
Sigang Yu
Zihao Wu
Chong Ma
...
Dajiang Zhu
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
LM&MA
136
116
0
28 Apr 2023
Segment Anything Model for Medical Images?
Yuhao Huang
Yitian Zhao
Lei Mou
Huazhu Fu
Ao Chang
...
Lei Li
Vicente Grau
M. Akiba
Fajin Dong
Jiang-Dong Liu
VLM
146
84
0
28 Apr 2023
Auto-Linear Phenomenon in Subsurface Imaging
Yinan Feng
Yinpeng Chen
Peng Jin
Shihang Feng
Zicheng Liu
Youzuo Lin
73
7
0
27 Apr 2023
Motion-Conditioned Diffusion Model for Controllable Video Synthesis
Tsai-Shien Chen
C. Lin
Hung-Yu Tseng
Nayeon Lee
Ming-Hsuan Yang
DiffM
VGen
141
67
0
27 Apr 2023
Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking
Xin Chen
Houwen Peng
Jiawen Zhu
Dong Wang
Han Hu
Huchuan Lu
147
23
0
27 Apr 2023
Lightweight, Pre-trained Transformers for Remote Sensing Timeseries
Gabriel Tseng
Ruben Cartuyvels
Ivan Zvonkov
Mirali Purohit
David Rolnick
Hannah Kerner
147
66
0
27 Apr 2023
SkinSAM: Empowering Skin Cancer Segmentation with Segment Anything Model
Mingzhe Hu
Yuheng Li
Xiaofeng Yang
VLM
204
54
0
27 Apr 2023
Retrieval-based Knowledge Augmented Vision Language Pre-training
Jiahua Rao
Zifei Shan
Long Liu
Yao Zhou
Yuedong Yang
VLM
163
14
0
27 Apr 2023
Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning
Casey Meehan
Florian Bordes
Pascal Vincent
Kamalika Chaudhuri
Chuan Guo
77
18
0
26 Apr 2023
Compensation Learning in Semantic Segmentation
Timo Kaiser
Christoph Reinders
Bodo Rosenhahn
NoLa
71
3
0
26 Apr 2023
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Sheng Liu
C. P. Huynh
Congmin Chen
Maxim Arap
Raffay Hamid
114
19
0
25 Apr 2023
Objectives Matter: Understanding the Impact of Self-Supervised Objectives on Vision Transformer Representations
Shashank Shekhar
Florian Bordes
Pascal Vincent
Ari S. Morcos
83
10
0
25 Apr 2023
A Strong and Reproducible Object Detector with Only Public Datasets
Tianhe Ren
Jianwei Yang
Siyi Liu
Ailing Zeng
Feng Li
Hao Zhang
Hongyang Li
Zhaoyang Zeng
Lei Zhang
ObjD
80
11
0
25 Apr 2023
DuETT: Dual Event Time Transformer for Electronic Health Records
Alex Labach
Aslesha Pokhrel
Xiao Shi Huang
S. Zuberi
S. Yi
M. Volkovs
T. Poutanen
Rahul G. Krishnan
AI4TS
MedIm
71
3
0
25 Apr 2023
Learning imaging mechanism directly from optical microscopy observations
Ze-Hao Wang
Long-Kun Shan
Tong-Tian Weng
Tianrun Chen
Qiyuan Wang
Xiang-Dong Chen
Zhang Wang
Guanghsheng Guo
DiffM
35
1
0
25 Apr 2023
Img2Vec: A Teacher of High Token-Diversity Helps Masked AutoEncoders
Heng Pan
Chenyang Liu
Wenxiao Wang
Liejie Yuan
Hongfa Wang
Zhifeng Li
Wen Liu
VLM
64
3
0
25 Apr 2023
Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning
Zhongzhi Yu
Shang Wu
Y. Fu
Shunyao Zhang
Yingyan Lin
86
6
0
25 Apr 2023
Segment Anything in Medical Images
Jun Ma
Yuting He
Feifei Li
Li-Jun Han
Chenyu You
Bo Wang
MedIm
VLM
135
532
0
24 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
163
285
0
24 Apr 2023
MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer
QiHao Zhao
Yangyu Huang
Wei Hu
Fan Zhang
Jing Liu
ViT
75
16
0
24 Apr 2023
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation
Cilin Yan
Haochen Wang
Jie Liu
Xiaolong Jiang
Yao Hu
Xu Tang
Guoliang Kang
E. Gavves
VLM
107
0
0
23 Apr 2023
TransFlow: Transformer as Flow Learner
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
101
50
0
23 Apr 2023
Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heights
Ibrahim Fayad
P. Ciais
Martin Schwartz
J. Wigneron
N. Baghdadi
...
Alexandre d’Aspremont
F. Frappart
Sassan Saatchi
Agnès Pellissier-Tanon
Hassan Bazzi
100
35
0
22 Apr 2023
Incomplete Multimodal Learning for Remote Sensing Data Fusion
Yuxing Chen
Maofan Zhao
Lorenzo Bruzzone
75
3
0
22 Apr 2023