2111.12710
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
24 November 2021
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
Papers citing
"PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers"
50 / 190 papers shown
Title
Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra
Frederick Hoffman
Ken Chatfield
25
5
0
30 Dec 2022
Masked Event Modeling: Self-Supervised Pretraining for Event Cameras
Simone Klenk
David Bonello
Lukas Koestler
Nikita Araslanov
Daniel Cremers
29
23
0
20 Dec 2022
What do Vision Transformers Learn? A Visual Exploration
Amin Ghiasi
Hamid Kazemi
Eitan Borgnia
Steven Reich
Manli Shu
Micah Goldblum
A. Wilson
Tom Goldstein
ViT
28
60
0
13 Dec 2022
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Yunhe Wang
Chang Xu
33
9
0
13 Dec 2022
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Shuyang Gu
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
22
35
0
12 Dec 2022
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Lu Yuan
Yu-Gang Jiang
VGen
32
87
0
08 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
32
20
0
06 Dec 2022
MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Lukas Hoyer
Dengxin Dai
Haoran Wang
Luc Van Gool
52
220
0
02 Dec 2022
Self-Supervised Learning based on Heat Equation
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Lu Yuan
Zicheng Liu
Youzuo Lin
29
4
0
23 Nov 2022
Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration
Yunjie Tian
Lingxi Xie
Jihao Qiu
Jianbin Jiao
Yaowei Wang
Qi Tian
Qixiang Ye
ViT
36
6
0
23 Nov 2022
Contrastive Masked Autoencoders for Self-Supervised Video Hashing
Yuting Wang
Jinpeng Wang
Bin Chen
Ziyun Zeng
Shutao Xia
29
20
0
21 Nov 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
31
22
0
19 Nov 2022
CAE v2: Context Autoencoder with CLIP Target
Xinyu Zhang
Jiahui Chen
Junkun Yuan
Qiang Chen
Jian Wang
...
Jimin Pi
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
VLM
CLIP
50
24
0
17 Nov 2022
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Tianhong Li
Huiwen Chang
Shlok Kumar Mishra
Han Zhang
Dina Katabi
Dilip Krishnan
41
152
0
16 Nov 2022
Stare at What You See: Masked Image Modeling without Reconstruction
Hongwei Xue
Peng Gao
Hongyang Li
Yu Qiao
Hao Sun
Houqiang Li
Jiebo Luo
25
31
0
16 Nov 2022
Artificial intelligence approaches for materials-by-design of energetic materials: state-of-the-art, challenges, and future directions
Joseph B. Choi
Phong C. H. Nguyen
O. Sen
H. Udaykumar
Stephen Seung-Yeob Baek
PINN
AI4CE
23
11
0
15 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
78
675
0
14 Nov 2022
Masked Contrastive Representation Learning
Yuan Yao
Nandakishor Desai
M. Palaniswami
SSL
22
8
0
11 Nov 2022
Towards Sustainable Self-supervised Learning
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
CLL
45
7
0
20 Oct 2022
A Unified View of Masked Image Modeling
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
VLM
54
35
0
19 Oct 2022
Exploring Long-Sequence Masked Autoencoders
Ronghang Hu
Shoubhik Debnath
Saining Xie
Xinlei Chen
8
18
0
13 Oct 2022
Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders
Haosen Yang
Deng Huang
Bin Wen
Jiannan Wu
H. Yao
Yi-Xin Jiang
Xiatian Zhu
Zehuan Yuan
37
19
0
09 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
86
4
0
05 Oct 2022
Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders
Youngwan Lee
Jeffrey Willette
Jonghee Kim
Juho Lee
Sung Ju Hwang
31
16
0
05 Oct 2022
Attention Distillation: self-supervised vision transformer students need more guidance
Kai Wang
Fei Yang
Joost van de Weijer
ViT
24
16
0
03 Oct 2022
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
Junke Wang
Dongdong Chen
Zuxuan Wu
Chong Luo
Luowei Zhou
Yucheng Zhao
Yujia Xie
Ce Liu
Yu-Gang Jiang
Lu Yuan
MLLM
VLM
35
148
0
15 Sep 2022
MimCo: Masked Image Modeling Pre-training with Contrastive Teacher
Qiang-feng Zhou
Chaohui Yu
Haowen Luo
Zhibin Wang
Hao Li
VLM
54
20
0
07 Sep 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
54
158
0
25 Aug 2022
Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling
Rui Wang
Zuxuan Wu
Dongdong Chen
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Luowei Zhou
Lu Yuan
Yu-Gang Jiang
ViT
40
4
0
25 Aug 2022
VLMAE: Vision-Language Masked Autoencoder
Su He
Taian Guo
Tao Dai
Ruizhi Qiao
Chen Wu
Xiujun Shu
Bohan Ren
VLM
34
11
0
19 Aug 2022
Towards Label-efficient Automatic Diagnosis and Analysis: A Comprehensive Survey of Advanced Deep Learning-based Weakly-supervised, Semi-supervised and Self-supervised Techniques in Histopathological Image Analysis
Linhao Qu
Siyu Liu
Xiaoyu Liu
Manning Wang
Zhijian Song
25
56
0
18 Aug 2022
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
23
306
0
12 Aug 2022
MILAN: Masked Image Pretraining on Language Assisted Representation
Zejiang Hou
Fei Sun
Yen-kuang Chen
Yuan Xie
S. Kung
ViT
31
68
0
11 Aug 2022
Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
Xiangwen Kong
Xiangyu Zhang
SSL
32
53
0
08 Aug 2022
Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics
Xiaoyuan Guo
Jiali Duan
C.-C. Jay Kuo
J. Gichoya
Imon Banerjee
VLM
19
1
0
31 Jul 2022
SdAE: Self-distillated Masked Autoencoder
Yabo Chen
Yuchen Liu
Dongsheng Jiang
Xiaopeng Zhang
Wenrui Dai
H. Xiong
Qi Tian
ViT
20
70
0
31 Jul 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
57
71
0
30 Jul 2022
Contrastive Masked Autoencoders are Stronger Vision Learners
Zhicheng Huang
Xiaojie Jin
Cheng Lu
Qibin Hou
Ming-Ming Cheng
Dongmei Fu
Xiaohui Shen
Jiashi Feng
41
148
0
27 Jul 2022
FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Xiaoping Han
Licheng Yu
Xiatian Zhu
Li Zhang
Yi-Zhe Song
Tao Xiang
AI4TS
16
49
0
17 Jul 2022
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
22
75
0
14 Jul 2022
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
Sanat Ramesh
V. Srivastav
Deepak Alapatt
Tong Yu
Aditya Murali
...
Saurav Sharma
A. Fleurentin
Georgios Exarchakis
Alexandros Karargyris
N. Padoy
23
42
0
01 Jul 2022
SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders
Gang Li
Heliang Zheng
Daqing Liu
Chaoyue Wang
Bing-Huang Su
Changwen Zheng
32
124
0
21 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Jiahao Xie
Wei Li
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
27
69
0
15 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
32
22
0
09 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
N. Sebe
ViT
MDE
31
1
0
09 Jun 2022
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
Jia-Yu Pan
Pan Zhou
Shuicheng Yan
SSL
26
15
0
08 Jun 2022
Siamese Image Modeling for Self-Supervised Vision Representation Learning
Chenxin Tao
Xizhou Zhu
Weijie Su
Gao Huang
Bin Li
Jie Zhou
Yu Qiao
Xiaogang Wang
Jifeng Dai
SSL
40
94
0
02 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
21
21
0
02 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
44
36
0
01 Jun 2022