2104.14294
Emerging Properties in Self-Supervised Vision Transformers
29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
Papers citing
"Emerging Properties in Self-Supervised Vision Transformers"
50 / 4,175 papers shown
Title
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
101
29
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
200
18
0
30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
108
23
0
28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian Sun
Weiming Hu
ViT
148
46
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
75
15
0
28 May 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation
Yixuan Wei
Han Hu
Zhenda Xie
Zheng Zhang
Yue Cao
Jianmin Bao
Dong Chen
B. Guo
CLIP
158
128
0
27 May 2022
Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos
Gautam Singh
Yi-Fu Wu
Sungjin Ahn
OCL
147
121
0
27 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan Z. Li
84
49
0
27 May 2022
Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection
Hao Chen
Wenyuan Li
Songhang Chen
Zhenwei Shi
96
39
0
27 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
241
703
0
26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
125
72
0
26 May 2022
HIRL: A General Framework for Hierarchical Image Representation Learning
Minghao Xu
Yuanfan Guo
Xuanyu Zhu
Jiawen Li
Zhenbang Sun
Jiangtao Tang
Yi Xu
Bingbing Ni
SSL
32
3
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
67
55
0
26 May 2022
Does Your Model Classify Entities Reasonably? Diagnosing and Mitigating Spurious Correlations in Entity Typing
Nan Xu
Fei Wang
Bangzheng Li
Mingtao Dong
Muhao Chen
85
21
0
25 May 2022
Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods
Randall Balestriero
Yann LeCun
SSL
112
135
0
23 May 2022
Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering
Ekdeep Singh Lubana
Chi Ian Tang
F. Kawsar
Robert P. Dick
Akhil Mathur
FedML
75
54
0
23 May 2022
Active Learning Through a Covering Lens
Ofer Yehuda
Avihu Dekel
Guy Hacohen
D. Weinshall
86
50
0
23 May 2022
Continual Barlow Twins: continual self-supervised learning for remote sensing semantic segmentation
V. Marsocci
Simone Scardapane
CLL
95
27
0
23 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
285
368
0
21 May 2022
A Study on Transformer Configuration and Training Objective
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
87
9
0
21 May 2022
Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)
Jue Jiang
N. Tyagi
K. Tringale
C. Crane
Harini Veeraraghavan
MedIm
104
37
0
20 May 2022
Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision
Michael A. Hobley
V. Prisacariu
124
42
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
75
54
0
19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
203
11
0
19 May 2022
K-textures, a self-supervised hard clustering deep learning algorithm for satellite image segmentation
F. Wagner
Ricardo Dalagnol
Alber H. Sánchez
Mayumi C. M. Hirye
Samuel Favrichon
Jake H. Lee
S. Mauceri
Yan Yang
Sassan Saatchi
97
12
0
18 May 2022
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
Arda Sahiner
Tolga Ergen
Batu Mehmet Ozturkler
John M. Pauly
Morteza Mardani
Mert Pilanci
132
33
0
17 May 2022
Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion
Subhabrata Choudhury
Laurynas Karazija
Iro Laina
Andrea Vedaldi
Christian Rupprecht
OCL
VOS
177
41
0
16 May 2022
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
Andrea Vedaldi
122
168
0
16 May 2022
Scalable Vehicle Re-Identification via Self-Supervision
Pirazh Khorramshahi
Vineet Shenoy
Rama Chellappa
45
0
0
16 May 2022
Toward a Geometrical Understanding of Self-supervised Contrastive Learning
Romain Cosentino
Anirvan M. Sengupta
Salman Avestimehr
Mahdi Soltanolkotabi
Antonio Ortega
Ted Willke
Mariano Tepper
SSL
97
17
0
13 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
116
35
0
12 May 2022
DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision
Erik Wallin
Lennart Svensson
Fredrik Kahl
Lars Hammarstrand
SSL
68
13
0
11 May 2022
Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder
S. Ly
Bai Lin
Hung Q. Vo
D. Maric
B. Roysam
H. V. Nguyen
53
0
0
10 May 2022
Learning to Answer Visual Questions from Web Videos
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
89
35
0
10 May 2022
Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
Haiyang Yang
Meilin Chen
Yizhou Wang
Shixiang Tang
Feng Zhu
Lei Bai
Rui Zhao
Wanli Ouyang
73
19
0
10 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
79
128
0
08 May 2022
Quantification of Robotic Surgeries with Vision-Based Deep Learning
Dani Kiyasseh
Runzhuo Ma
Taseen F. Haque
J. Nguyen
C. Wagner
Anima Anandkumar
A. Hung
MedIm
49
3
0
06 May 2022
Self-Supervised Learning for Invariant Representations from Multi-Spectral and SAR Images
P. Jain
Bianca Schoen-Phelan
R. Ross
70
34
0
04 May 2022
A Comprehensive Survey of Image Augmentation Techniques for Deep Learning
Mingle Xu
Sook Yoon
A. Fuentes
D. Park
VLM
128
432
0
03 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
105
39
0
02 May 2022
Continual Learning with Foundation Models: An Empirical Study of Latent Replay
O. Ostapenko
Timothée Lesort
P. Rodríguez
Md Rifat Arefin
Arthur Douillard
Irina Rish
Laurent Charlin
98
53
0
30 Apr 2022
Dynamic Curriculum Learning for Great Ape Detection in the Wild
Xinyu Yang
T. Burghardt
Majid Mirmehdi
95
14
0
30 Apr 2022
Depth Estimation with Simplified Transformer
John Yang
Le An
Anurag Dixit
Jinkyu Koo
Su Inn Park
MDE
77
21
0
28 Apr 2022
Offline Visual Representation Learning for Embodied Navigation
Karmesh Yadav
Ram Ramrakhya
Arjun Majumdar
Vincent-Pierre Berges
Sachit Kuhar
Dhruv Batra
Alexei Baevski
Oleksandr Maksymets
OffRL
SSL
111
79
0
27 Apr 2022
Self-Supervised Learning of Object Parts for Semantic Segmentation
A. Ziegler
Yuki M. Asano
SSL
OCL
117
103
0
27 Apr 2022
Understanding The Robustness in Vision Transformers
Daquan Zhou
Zhiding Yu
Enze Xie
Chaowei Xiao
Anima Anandkumar
Jiashi Feng
J. Álvarez
ViT
154
193
0
26 Apr 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge
Yixiao Ge
Xihui Liu
Alex Jinpeng Wang
Jianping Wu
Ying Shan
Xiaohu Qie
Ping Luo
VLM
81
44
0
26 Apr 2022
Context-Aware Sequence Alignment using 4D Skeletal Augmentation
Taein Kwon
Bugra Tekin
Siyu Tang
Marc Pollefeys
74
14
0
26 Apr 2022
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li
Xiaofei Li
ViT
58
22
0
26 Apr 2022
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
Xiaojian Ma
Weili Nie
Zhiding Yu
Huaizu Jiang
Chaowei Xiao
Yuke Zhu
Song-Chun Zhu
Anima Anandkumar
ViT
LRM
131
19
0
24 Apr 2022