2111.06377
Cited By
v3 (latest)
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing "Masked Autoencoders Are Scalable Vision Learners"
50 / 4,779 papers shown
Title
Self-supervised Learning by View Synthesis
Shaoteng Liu
Xiangyu Zhang
T. Hu
Jiaya Jia
3DV
ViT
115
1
0
22 Apr 2023
Benchmarking Low-Shot Robustness to Natural Distribution Shifts
Aaditya K. Singh
Kartik Sarangmath
Prithvijit Chattopadhyay
Judy Hoffman
OOD
90
1
0
21 Apr 2023
Deep-Learning-based Fast and Accurate 3D CT Deformable Image Registration in Lung Cancer
Yuzhen Ding
H. Feng
Yunze Yang
J. Holmes
Zheng-Ning Liu
...
N. Yu
Terence T. Sio
S. Schild
Baoxin Li
Wen Liu
MedIm
44
24
0
21 Apr 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
113
22
0
21 Apr 2023
CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval
Shangda Wu
Dingyao Yu
Xu Tan
Maosong Sun
CLIP
VLM
76
15
0
21 Apr 2023
Med-Tuning: A New Parameter-Efficient Tuning Framework for Medical Volumetric Segmentation
Jiachen Shen
Wenxuan Wang
Chen Chen
Jianbo Jiao
Jing Liu
Yan Zhang
Shanshan Song
Jiangyun Li
114
1
0
21 Apr 2023
FreMIM: Fourier Transform Meets Masked Image Modeling for Medical Image Segmentation
Wenxuan Wang
Jing Wang
Chen Chen
Jianbo Jiao
Yuanxiu Cai
Shanshan Song
Jiangyun Li
MedIm
103
19
0
21 Apr 2023
Hyperbolic Geometry in Computer Vision: A Survey
Pengfei Fang
Mehrtash Harandi
Trung Le
Dinh Q. Phung
69
4
0
21 Apr 2023
Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Johannes Lehner
Benedikt Alkin
Andreas Fürst
Elisabeth Rumetshofer
Lukas Miklautz
Sepp Hochreiter
111
18
0
20 Apr 2023
An Introduction to Transformers
Richard Turner
ViT
42
0
0
20 Apr 2023
DocMAE: Document Image Rectification via Self-supervised Representation Learning
Shaokai Liu
Hao Feng
Wen-gang Zhou
Houqiang Li
Cong Liu
Feng Wu
64
6
0
20 Apr 2023
Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?
Hao Peng
Baopu Li
Bo Zhang
Xin Chen
Tao Chen
Erik Cambria
3DPC
97
1
0
20 Apr 2023
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
Peng Cui
Dan Zhang
Zhijie Deng
Yinpeng Dong
Junyi Zhu
64
12
0
20 Apr 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
170
147
0
19 Apr 2023
CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding
Dilxat Muhtar
Xue-liang Zhang
Pengfeng Xiao
Zhenshi Li
Feng-Xue Gu
SSL
114
59
0
19 Apr 2023
DCELANM-Net:Medical Image Segmentation based on Dual Channel Efficient Layer Aggregation Network with Learner
Cheng Lu
Z. Xia
Krzysztof Przystupa
Orest Kochan
J. Su
MedIm
75
11
0
19 Apr 2023
Denoising Cosine Similarity: A Theory-Driven Approach for Efficient Representation Learning
Takumi Nakagawa
Y. Sanada
Hiroki Waida
Yuhui Zhang
Yuichiro Wada
K. Takanashi
Tomonori Yamada
Takafumi Kanamori
DiffM
59
5
0
19 Apr 2023
Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation
Hao Chen
Peng Zheng
Xin Wang
Shu Hu
Bin Zhu
Jinrong Hu
Xi Wu
Siwei Lyu
77
4
0
19 Apr 2023
To Compress or Not to Compress- Self-Supervised Learning and Information Theory: A Review
Ravid Shwartz-Ziv
Yann LeCun
SSL
132
75
0
19 Apr 2023
Hyperbolic Image-Text Representations
Karan Desai
Maximilian Nickel
Tanmay Rajpurohit
Justin Johnson
Ramakrishna Vedantam
VLM
109
67
0
18 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Zheng Lian
Haiyang Sun
Guoying Zhao
Kang Chen
Mingyu Xu
...
Meng Wang
Min Zhang
Guoying Zhao
Björn W. Schuller
Jianhua Tao
96
51
0
18 Apr 2023
Self-Supervised 3D Action Representation Learning with Skeleton Cloud Colorization
Siyuan Yang
Jun Liu
Shijian Lu
Er Meng Hwa
Yongjian Hu
Alex C. Kot
3DPC
3DH
74
18
0
18 Apr 2023
W-MAE: Pre-trained weather model with masked autoencoder for multi-variable weather forecasting
Xin Man
Chenghong Zhang
Jin Feng
Changyu Li
Jie Shao
AI4Cl
117
26
0
18 Apr 2023
Pretrained Language Models as Visual Planners for Human Assistance
Dhruvesh Patel
H. Eghbalzadeh
Nitin Kamra
Michael L. Iuzzolino
Unnat Jain
Ruta Desai
LM&Ro
87
25
0
17 Apr 2023
BenchMD: A Benchmark for Unified Learning on Medical Images and Sensors
Kathryn Wantlin
Chenwei Wu
Shih-Cheng Huang
Oishi Banerjee
Farah Z. Dadabhoy
...
A. Adamson
Laura Heacock
G. Tison
Alex Tamkin
Pranav Rajpurkar
SSL
OOD
79
3
0
17 Apr 2023
Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models
Ziwei Luo
Fredrik K. Gustafsson
Zhengli Zhao
Jens Sjölund
Thomas B. Schon
86
112
0
17 Apr 2023
Deep learning universal crater detection using Segment Anything Model (SAM)
I. Giannakis
A. Bhardwaj
L. Sam
Georgios Leontidis
66
22
0
16 Apr 2023
Permutation Equivariance of Transformers and Its Applications
Hengyuan Xu
Liyao Xiang
Hang Ye
Dixi Yao
Pengzhi Chu
Baochun Li
56
15
0
16 Apr 2023
Autoencoders with Intrinsic Dimension Constraints for Learning Low Dimensional Image Representations
Jianzhang Zheng
Hao Shen
Jian Yang
Xuan Tang
Mingsong Chen
Hui Yu
Jielong Guo
Xian Wei
46
0
0
16 Apr 2023
Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging
Jielin Qiu
Peide Huang
Makiya Nakashima
Jae-Hyeok Lee
Jiacheng Zhu
...
Byung-Hak Kim
Debbie Kwon
Douglas Weber
Ding Zhao
David Chen
SSL
72
6
0
16 Apr 2023
PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute Recognition
Xinwen Fan
Yukang Zhang
Yang Lu
Hanzi Wang
ViT
81
31
0
14 Apr 2023
Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models
Yaohua Zha
Jinpeng Wang
Tao Dai
Bin Chen
Zhi Wang
Shutao Xia
VLM
117
48
0
14 Apr 2023
Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar
James M. Tolan
Hung-I Yang
Ben Nosarzewski
Guillaume Couairon
Huy Q. Vo
...
Piotr Bojanowski
T. Johns
Brian White
T. Tiecke
Camille Couprie
103
116
0
14 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
540
3,536
0
14 Apr 2023
A Comparative Study on Generative Models for High Resolution Solar Observation Imaging
Mehdi Cherti
Alexander Czernik
Stefan Kesselheim
F. Effenberger
J. Jitsev
DiffM
37
0
0
14 Apr 2023
SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders
Qingsen Yan
Song Zhang
Weiye Chen
Hao Tang
Yu Zhu
Jinqiu Sun
Luc Van Gool
Yanning Zhang
122
14
0
14 Apr 2023
3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining
Siming Yan
Yu-Qi Yang
Yu-Xiao Guo
Hao Pan
Peng-shuai Wang
Xin Tong
Yang Liu
Qi-Xing Huang
3DPC
92
14
0
14 Apr 2023
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Yu-Qi Yang
Yu-Xiao Guo
Jiangfeng Xiong
Yang Liu
Hao Pan
Peng-Shuai Wang
Xin Tong
B. Guo
ViT
108
88
0
14 Apr 2023
How Will It Drape Like? Capturing Fabric Mechanics from Depth Images
Carlos Rodriguez-Pardo
Melania Prieto-Martin
Dan Casas
Elena Garces
81
12
0
13 Apr 2023
DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning
Enze Xie
Lewei Yao
Han Shi
Zhili Liu
Daquan Zhou
Zhaoqiang Liu
Jiawei Li
Zhenguo Li
74
81
0
13 Apr 2023
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation
Mohit Sharma
Claudio Fantacci
Yuxiang Zhou
Skanda Koppula
N. Heess
Jonathan Scholz
Y. Aytar
VLM
109
31
0
13 Apr 2023
Efficient Multimodal Fusion via Interactive Prompting
Yaowei Li
Ruijie Quan
Linchao Zhu
Yezhou Yang
82
45
0
13 Apr 2023
RECLIP: Resource-efficient CLIP by Training with Small Images
Runze Li
Dahun Kim
B. Bhanu
Weicheng Kuo
VLM
CLIP
86
13
0
12 Apr 2023
Hard Patches Mining for Masked Image Modeling
Haochen Wang
Kaiyou Song
Junsong Fan
Yuxi Wang
Jin Xie
Zhaoxiang Zhang
72
64
0
12 Apr 2023
Unicom: Universal and Compact Representation Learning for Image Retrieval
Xiang An
Jiankang Deng
Kaicheng Yang
Jaiwei Li
Ziyong Feng
Jia Guo
Jing Yang
Tongliang Liu
VLM
SSL
97
28
0
12 Apr 2023
Multi-scale Geometry-aware Transformer for 3D Point Cloud Classification
Xian Wei
Muyu Wang
S. J. Lin
Zhengyu Li
Jian Yang
Arafat Al-Jawari
Xuan Tang
3DPC
ViT
69
2
0
12 Apr 2023
Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
Yifeng Shi
Feng Lv
Xinliang Wang
Chunlong Xia
Shaojie Li
Shu-Zhen Yang
Teng Xi
Gang Zhang
VLM
156
13
0
12 Apr 2023
Learning Transferable Pedestrian Representation from Multimodal Information Supervision
Li-Na Bao
Longhui Wei
Xiaoyu Qiu
Wen-gang Zhou
Houqiang Li
Qi Tian
SSL
75
5
0
12 Apr 2023
MoMo: A shared encoder Model for text, image and multi-Modal representations
Rakesh Chada
Zhao-Heng Zheng
P. Natarajan
ViT
64
4
0
11 Apr 2023
A surprisingly simple technique to control the pretraining bias for better transfer: Expand or Narrow your representation
Florian Bordes
Samuel Lavoie
Randall Balestriero
Nicolas Ballas
Pascal Vincent
SSL
85
5
0
11 Apr 2023