2104.14294
Emerging Properties in Self-Supervised Vision Transformers
29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
Papers citing
"Emerging Properties in Self-Supervised Vision Transformers"
50 / 1,258 papers shown
Title
HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images
A. Athar
Jonathon Luiten
Alexander Hermans
Deva Ramanan
Bastian Leibe
VOS
30
25
0
16 Dec 2021
Ensembling Off-the-shelf Models for GAN Training
Nupur Kumari
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
34
86
0
16 Dec 2021
Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard
Mathilde Caron
Lucas Hosseini
Sebastian Riedel
Piotr Bojanowski
Armand Joulin
Edouard Grave
RALM
38
808
0
16 Dec 2021
Deep Hash Distillation for Image Retrieval
Young Kyun Jang
Geonmo Gu
ByungSoo Ko
Isaac Kang
N. Cho
21
34
0
16 Dec 2021
FIgLib & SmokeyNet: Dataset and Deep Learning Model for Real-Time Wildland Fire Smoke Detection
Anshuman Dewangan
Yash Pande
Hans-Werner Braun
F. Vernon
Ismael Pérez
I. Altintas
G. Cottrell
M. H. Nguyen
16
45
0
16 Dec 2021
Towards General and Efficient Active Learning
Yichen Xie
Masayoshi Tomizuka
Wei Zhan
VLM
35
10
0
15 Dec 2021
Deep ViT Features as Dense Visual Descriptors
Shirzad Amir
Yossi Gandelsman
Shai Bagon
Tali Dekel
MDE
ViT
36
273
0
10 Dec 2021
Label, Verify, Correct: A Simple Few Shot Object Detection Method
Prannay Kaul
Weidi Xie
Andrew Zisserman
ObjD
20
81
0
10 Dec 2021
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
40
687
0
08 Dec 2021
Self-Supervised Models are Continual Learners
Enrico Fini
Victor G. Turrisi da Costa
Xavier Alameda-Pineda
Elisa Ricci
Alahari Karteek
Julien Mairal
BDL
CLL
SSL
41
158
0
08 Dec 2021
Joint Learning of Localized Representations from Medical Images and Reports
Philipp Muller
Georgios Kaissis
Cong Zou
Daniel Munich
137
81
0
06 Dec 2021
Forward Compatible Training for Large-Scale Embedding Retrieval Systems
Vivek Ramanujan
Pavan Kumar Anasosalu Vasu
Ali Farhadi
Oncel Tuzel
Hadi Pouransari
VLM
32
16
0
06 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
36
203
0
02 Dec 2021
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
94
551
0
02 Dec 2021
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation
Zhao-Heng Yin
Pichao Wang
Fan Wang
Xianzhe Xu
Hanling Zhang
Hao Li
Rong Jin
41
39
0
02 Dec 2021
Self-supervised Video Transformer
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
F. Khan
Michael S. Ryoo
ViT
39
84
0
02 Dec 2021
Revisiting the Transferability of Supervised Pretraining: an MLP Perspective
Yizhou Wang
Shixiang Tang
Feng Zhu
Lei Bai
Rui Zhao
Donglian Qi
Wanli Ouyang
26
51
0
01 Dec 2021
Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup
Siyuan Li
Zicheng Liu
Zedong Wang
Di Wu
Zihan Liu
Stan Z. Li
35
26
0
30 Nov 2021
MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
Sara Atito
Muhammad Awais
Ammarah Farooq
Zhenhua Feng
J. Kittler
17
17
0
30 Nov 2021
Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis
Yucheng Tang
Dong Yang
Wenqi Li
H. Roth
Bennett Landman
Daguang Xu
V. Nath
Ali Hatamizadeh
ViT
MedIm
42
517
0
29 Nov 2021
SWAT: Spatial Structure Within and Among Tokens
Kumara Kahatapitiya
Michael S. Ryoo
25
6
0
26 Nov 2021
Semantic-Aware Generation for Self-Supervised Visual Representation Learning
Yunjie Tian
Lingxi Xie
Xiaopeng Zhang
Jiemin Fang
Haohang Xu
Wei Huang
Jianbin Jiao
Qi Tian
QiXiang Ye
SSL
GAN
36
16
0
25 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
48
238
0
24 Nov 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
41
214
0
24 Nov 2021
RegionCL: Can Simple Region Swapping Contribute to Contrastive Learning?
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
SSL
21
18
0
24 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
34
40
0
23 Nov 2021
RedCaps: web-curated image-text data created by the people, for the people
Karan Desai
Gaurav Kaul
Zubin Aysola
Justin Johnson
19
162
0
22 Nov 2021
Class-agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
Rao Muhammad Anwer
Ming Yang
20
91
0
22 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollár
Kaiming He
Ross B. Girshick
20
165
0
22 Nov 2021
Global and Local Alignment Networks for Unpaired Image-to-Image Translation
Guanglei Yang
H. Tang
Humphrey Shi
M. Ding
N. Sebe
Radu Timofte
Luc Van Gool
Elisa Ricci
18
1
0
19 Nov 2021
ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation
Laurynas Karazija
Iro Laina
Christian Rupprecht
3DV
VOS
33
84
0
19 Nov 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
57
1,309
0
18 Nov 2021
TransMix: Attend to Mix for Vision Transformers
Jieneng Chen
Shuyang Sun
Ju He
Philip Torr
Alan Yuille
S. Bai
ViT
28
103
0
18 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
32
657
0
17 Nov 2021
LiT: Zero-Shot Transfer with Locked-image text Tuning
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
48
543
0
15 Nov 2021
iBOT: Image BERT Pre-Training with Online Tokenizer
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
21
710
0
15 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
308
7,443
0
11 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
77
330
0
11 Nov 2021
A Relational Model for One-Shot Classification
Arturs Polis
Alexander Ilin
BDL
VLM
21
1
0
08 Nov 2021
PatchGame: Learning to Signal Mid-level Patches in Referential Games
Kamal Gupta
Gowthami Somepalli
Anubhav Gupta
Vinoj Jayasundara
Matthias Zwicker
Abhinav Shrivastava
25
3
0
02 Nov 2021
Blending Anti-Aliasing into Vision Transformer
Shengju Qian
Hao Shao
Yi Zhu
Mu Li
Jiaya Jia
26
20
0
28 Oct 2021
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
Fei Deng
Ingook Jang
Sungjin Ahn
VLM
29
62
0
27 Oct 2021
Self-supervised similarity search for large scientific datasets
G. Stein
P. Harrington
Jacqueline Blaum
Tomislav Medan
Z. Lukić
21
21
0
25 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
38
268
0
19 Oct 2021
TLDR: Twin Learning for Dimensionality Reduction
Yannis Kalantidis
Carlos Lassance
Jon Almazán
Diane Larlus
SSL
27
10
0
18 Oct 2021
Understanding Dimensional Collapse in Contrastive Self-supervised Learning
Li Jing
Pascal Vincent
Yann LeCun
Yuandong Tian
SSL
25
338
0
18 Oct 2021
Self-Supervised Learning by Estimating Twin Class Distributions
Feng Wang
Tao Kong
Rufeng Zhang
Huaping Liu
Hang Li
SSL
55
16
0
14 Oct 2021
Decoupled Contrastive Learning
Chun-Hsiao Yeh
Cheng-Yao Hong
Yen-Chi Hsu
Tyng-Luh Liu
Yubei Chen
Yann LeCun
180
182
0
13 Oct 2021
Can machines learn to see without visual databases?
Alessandro Betti
Marco Gori
S. Melacci
Marcello Pelillo
Fabio Roli
VLM
37
3
0
12 Oct 2021
Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning
Chongjian Ge
Youwei Liang
Yibing Song
Jianbo Jiao
Jue Wang
Ping Luo
ViT
21
36
0
11 Oct 2021