2111.06377
Cited By
Masked Autoencoders Are Scalable Vision Learners
11 November 2021
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
Papers citing
"Masked Autoencoders Are Scalable Vision Learners"
50 / 4,611 papers shown
Title
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
31
159
0
04 Mar 2022
ViT-P: Rethinking Data-efficient Vision Transformers from Locality
B. Chen
Ran A. Wang
Di Ming
Xin Feng
ViT
13
7
0
04 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
S. Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
16
36
0
03 Mar 2022
Instance Segmentation for Autonomous Log Grasping in Forestry Operations
Jean-Michel Fortin
Olivier Gamache
Vincent Grondin
F. Pomerleau
Philippe Giguère
19
22
0
03 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
24
106
0
02 Mar 2022
Learning Moving-Object Tracking with FMCW LiDAR
Yinjuan Gu
Hongzhi Cheng
Kafeng Wang
Dejing Dou
Chengzhong Xu
Hui Kong
13
6
0
02 Mar 2022
LISA: Learning Interpretable Skill Abstractions from Language
Divyansh Garg
Skanda Vaidyanath
Kuno Kim
Jiaming Song
Stefano Ermon
LM&Ro
OffRL
145
29
0
28 Feb 2022
Unsupervised Point Cloud Representation Learning with Deep Neural Networks: A Survey
Aoran Xiao
Jiaxing Huang
Dayan Guan
Xiaoqin Zhang
Shijian Lu
Ling Shao
3DPC
13
70
0
28 Feb 2022
Reconstruction Task Finds Universal Winning Tickets
Ruichen Li
Binghui Li
Qi Qian
Liwei Wang
16
0
0
23 Feb 2022
HiP: Hierarchical Perceiver
João Carreira
Skanda Koppula
Daniel Zoran
Adrià Recasens
Catalin Ionescu
...
M. Botvinick
Oriol Vinyals
Karen Simonyan
Andrew Zisserman
Andrew Jaegle
VLM
21
14
0
22 Feb 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
22
229
0
21 Feb 2022
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
17
636
0
20 Feb 2022
Masked prediction tasks: a parameter identifiability view
Bingbin Liu
Daniel J. Hsu
Pradeep Ravikumar
Andrej Risteski
SSL
OOD
13
4
0
18 Feb 2022
Graph Masked Autoencoders with Transformers
Sixiao Zhang
Hongxu Chen
Haoran Yang
Xiangguo Sun
Philip S. Yu
Guandong Xu
13
18
0
17 Feb 2022
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Priya Goyal
Quentin Duval
Isaac Seessel
Mathilde Caron
Ishan Misra
Levent Sagun
Armand Joulin
Piotr Bojanowski
VLM
SSL
21
110
0
16 Feb 2022
Should You Mask 15% in Masked Language Modeling?
Alexander Wettig
Tianyu Gao
Zexuan Zhong
Danqi Chen
CVBM
26
160
0
16 Feb 2022
Meta Knowledge Distillation
Jihao Liu
Boxiao Liu
Hongsheng Li
Yu Liu
18
25
0
16 Feb 2022
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval
Licheng Yu
Jun Chen
Animesh Sinha
Mengjiao MJ Wang
Hugo Chen
Tamara L. Berg
Ning Zhang
VLM
23
39
0
15 Feb 2022
AI can evolve without labels: self-evolving vision transformer for chest X-ray diagnosis through knowledge distillation
S. Park
Gwanghyun Kim
Y. Oh
J. Seo
Sang Min Lee
Jin Hwan Kim
Sungjun Moon
Jae-Kwang Lim
Changhyun Park
Jong Chul Ye
ViT
MedIm
17
49
0
13 Feb 2022
MaskGIT: Masked Generative Image Transformer
Huiwen Chang
Han Zhang
Lu Jiang
Ce Liu
William T. Freeman
ViT
6
618
0
08 Feb 2022
How to Understand Masked Autoencoders
Shuhao Cao
Peng-Tao Xu
David A. Clifton
13
40
0
08 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
27
835
0
07 Feb 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
17
86
0
07 Feb 2022
Robust Semantic Communications Against Semantic Noise
Qiyu Hu
Guangyi Zhang
Zhijin Qin
Yunlong Cai
Guanding Yu
Geoffrey Ye Li
AAML
10
80
0
07 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
34
849
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
35
386
0
07 Feb 2022
Self-supervised Learning with Random-projection Quantizer for Speech Recognition
Chung-Cheng Chiu
James Qin
Yu Zhang
Jiahui Yu
Yonghui Wu
SSL
14
162
0
03 Feb 2022
AtmoDist: Self-supervised Representation Learning for Atmospheric Dynamics
Sebastian Hoffmann
C. Lessig
AI4Cl
24
8
0
02 Feb 2022
Adversarial Masking for Self-Supervised Learning
Yuge Shi
N. Siddharth
Philip H. S. Torr
Adam R. Kosiorek
SSL
46
82
0
31 Jan 2022
A Frustratingly Simple Approach for End-to-End Image Captioning
Ziyang Luo
Yadong Xi
Rongsheng Zhang
Jing Ma
VLM
MLLM
17
16
0
30 Jan 2022
Research on Patch Attentive Neural Process
Xiaohan Yu
Shao‐Chen Mao
12
1
0
29 Jan 2022
Mask-based Latent Reconstruction for Reinforcement Learning
Tao Yu
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
17
44
0
28 Jan 2022
ShapeFormer: Transformer-based Shape Completion via Sparse Representation
Xingguang Yan
Liqiang Lin
Niloy J. Mitra
Dani Lischinski
Daniel Cohen-Or
Hui Huang
ViT
56
112
0
25 Jan 2022
Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha
Xinlei He
Ning Yu
Michael Backes
Yang Zhang
23
34
0
19 Jan 2022
TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval
Yue Ruan
Han-Hung Lee
Yiming Zhang
Ke Zhang
Angel X. Chang
22
22
0
19 Jan 2022
RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training
Luyang Wang
Feng Liang
Yangguang Li
Honggang Zhang
Wanli Ouyang
Jing Shao
ViT
23
24
0
18 Jan 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
102
0
16 Jan 2022
Transferability in Deep Learning: A Survey
Junguang Jiang
Yang Shu
Jianmin Wang
Mingsheng Long
OOD
17
100
0
15 Jan 2022
ViT2Hash: Unsupervised Information-Preserving Hashing
Qinkang Gong
Liangdao Wang
Hanjiang Lai
Yan Pan
Jian Yin
11
4
0
14 Jan 2022
Time Series Generation with Masked Autoencoder
Meng-yue Zha
SiuTim Wong
Mengqi Liu
Tong Zhang
Kani Chen
SyDa
AI4TS
25
17
0
14 Jan 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chaozheng Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
40
4,945
0
10 Jan 2022
VGAER: Graph Neural Network Reconstruction based Community Detection
Chenyang Qiu
Zhaoci Huang
Wenzhe Xu
Huijia Li
23
17
0
08 Jan 2022
MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs
Qiaoyu Tan
Ninghao Liu
Xiao Shi Huang
Rui Chen
Soo-Hyun Choi
Xia Hu
SSL
14
39
0
07 Jan 2022
Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task
Quan Feng
Songcan Chen
6
5
0
07 Jan 2022
Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning
Siming Yan
Zhenpei Yang
Haoxiang Li
Chen Song
Li Guan
Hao Kang
G. Hua
Qi-Xing Huang
3DPC
29
61
0
03 Jan 2022
SLIP: Self-supervision meets Language-Image Pre-training
Norman Mu
Alexander Kirillov
David A. Wagner
Saining Xie
VLM
CLIP
37
475
0
23 Dec 2021
Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
Alaaeldin El-Nouby
Gautier Izacard
Hugo Touvron
Ivan Laptev
Hervé Jégou
Edouard Grave
SSL
11
148
0
20 Dec 2021
On Efficient Transformer-Based Image Pre-training for Low-Level Vision
Wenbo Li
Xin Lu
Shengju Qian
Jiangbo Lu
X. Zhang
Jiaya Jia
ViT
24
83
0
19 Dec 2021
RELAX: Representation Learning Explainability
Kristoffer Wickstrøm
Daniel J. Trosten
Sigurd Løkse
Ahcène Boubekki
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
FAtt
4
14
0
19 Dec 2021
Contrastive Vision-Language Pre-training with Limited Resources
Quan Cui
Boyan Zhou
Yu Guo
Weidong Yin
Hao Wu
Osamu Yoshie
Yubo Chen
VLM
CLIP
11
32
0
17 Dec 2021