2104.02057
An Empirical Study of Training Self-Supervised Vision Transformers
5 April 2021
Xinlei Chen
Saining Xie
Kaiming He
ViT
Papers citing
"An Empirical Study of Training Self-Supervised Vision Transformers"
50 / 469 papers shown
Title
ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image Representations
Chinmay Prabhakar
Hongwei Bran Li
Jiancheng Yang
Suprosanna Shit
Benedikt Wiestler
Bjoern H. Menze
ViT
MedIm
39
11
0
18 Jan 2023
Vision Learners Meet Web Image-Text Pairs
Bingchen Zhao
Quan Cui
Hao Wu
Osamu Yoshie
Cheng Yang
Oisin Mac Aodha
VLM
27
5
0
17 Jan 2023
SemPPL: Predicting pseudo-labels for better contrastive representations
Matko Bošnjak
Pierre Harvey Richemond
Nenad Tomašev
Florian Strub
Jacob Walker
Felix Hill
Lars Buesing
Razvan Pascanu
Charles Blundell
Jovana Mitrović
SSL
VLM
46
9
0
12 Jan 2023
EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata
Chenhao Zheng
Ayush Shrivastava
Andrew Owens
VLM
33
11
0
11 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi-Xin Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
36
100
0
09 Jan 2023
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Jidong Ge
YuXiang Liu
Jie Gui
Lanting Fang
Ming Lin
James T. Kwok
LiGuo Huang
B. Luo
SSL
22
5
0
08 Jan 2023
CiT: Curation in Training for Effective Vision-Language Data
Hu Xu
Saining Xie
Po-Yao (Bernie) Huang
Licheng Yu
Russ Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM
DiffM
33
25
0
05 Jan 2023
Event Camera Data Pre-training
Yan Yang
Liyuan Pan
Liu Liu
23
32
0
05 Jan 2023
Learning Decorrelated Representations Efficiently Using Fast Fourier Transform
Yutaro Shigeto
Masashi Shimbo
Yuya Yoshikawa
A. Takeuchi
24
0
0
04 Jan 2023
Semi-MAE: Masked Autoencoders for Semi-supervised Vision Transformers
Haojie Yu
Kangnian Zhao
Xiaoming Xu
ViT
31
1
0
04 Jan 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren
Fangyun Wei
Zheng-Wei Zhang
Han Hu
40
34
0
03 Jan 2023
A New Perspective to Boost Vision Transformer for Medical Image Classification
Yuexiang Li
Yawen Huang
Nanjun He
Kai Ma
Yefeng Zheng
ViT
MedIm
21
3
0
03 Jan 2023
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo
Shoubhik Debnath
Ronghang Hu
Xinlei Chen
Zhuang Liu
In So Kweon
Saining Xie
SyDa
82
727
0
02 Jan 2023
Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra
Frederick Hoffman
Ken Chatfield
33
5
0
30 Dec 2022
Swin MAE: Masked Autoencoders for Small Datasets
Zi'an Xu
Yin Dai
Fayu Liu
Weibin Chen
Yue Liu
Li-Li Shi
Sheng Liu
Yuhang Zhou
SyDa
MedIm
ViT
36
28
0
28 Dec 2022
Precise Location Matching Improves Dense Contrastive Learning in Digital Pathology
Jingwei Zhang
S. Kapse
Ke Ma
Prateek Prasanna
Maria Vakalopoulou
Joel H. Saltz
Dimitris Samaras
32
9
0
23 Dec 2022
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
J. Denize
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
SSL
19
6
0
21 Dec 2022
Image Segmentation-based Unsupervised Multiple Objects Discovery
Sandra Kara
Hejer Ammar
Florian Chabot
Q. C. Pham
OCL
24
6
0
20 Dec 2022
MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency
Mingye Xu
Mutian Xu
Tong He
Wanli Ouyang
Yali Wang
Xiaoguang Han
Yu Qiao
34
10
0
20 Dec 2022
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
38
5
0
19 Dec 2022
Attentive Mask CLIP
Yifan Yang
Weiquan Huang
Yixuan Wei
Houwen Peng
Xinyang Jiang
...
Fangyun Wei
Yin Wang
Han Hu
Lili Qiu
Yuqing Yang
CLIP
VLM
42
27
0
16 Dec 2022
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM
SSL
32
92
0
14 Dec 2022
Boosting Semi-Supervised Learning with Contrastive Complementary Labeling
Qinyi Deng
Yong Guo
Zhibang Yang
Haolin Pan
Jian Chen
35
10
0
13 Dec 2022
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Yunhe Wang
Chang Xu
33
9
0
13 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
M. Pantic
SSL
45
48
0
12 Dec 2022
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Zhiheng Li
Ivan Evtimov
Albert Gordo
C. Hazirbas
Tal Hassner
Cristian Canton Ferrer
Chenliang Xu
Mark Ibrahim
34
71
0
09 Dec 2022
ViTPose++: Vision Transformer for Generic Body Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
42
41
0
07 Dec 2022
Semi-Supervised Object Detection with Object-wise Contrastive Learning and Regression Uncertainty
H. Choi
Zhixiang Chen
Xuepeng Shi
Tae-Kyun Kim
19
4
0
06 Dec 2022
Spuriosity Rankings: Sorting Data to Measure and Mitigate Biases
Mazda Moayeri
Wenxiao Wang
Sahil Singla
S. Feizi
69
14
0
05 Dec 2022
Learning Imbalanced Data with Vision Transformers
Zhengzhuo Xu
R. Liu
Shuo Yang
Zenghao Chai
Chun Yuan
37
33
0
05 Dec 2022
Med-Query: Steerable Parsing of 9-DoF Medical Anatomies with Query Embedding
Heng Guo
Jianfeng Zhang
K. Yan
Le Lu
Minfeng Xu
MedIm
19
2
0
05 Dec 2022
Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering
D. M. Nguyen
Hoangvu Nguyen
M. T. N. Truong
T. Cao
Binh Duc Nguyen
Nhat Ho
Paul Swoboda
Shadi Albarqouni
P. Xie
Daniel Sonntag
SSL
29
21
0
04 Dec 2022
Exploring Stochastic Autoregressive Image Modeling for Visual Representation
Yu-Hang Qi
Fan Yang
Yousong Zhu
Yufei Liu
Liwei Wu
Rui Zhao
Wei Li
DiffM
27
13
0
03 Dec 2022
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal
Ananya Kumar
Sankalp Garg
Zico Kolter
Aditi Raghunathan
CLIP
VLM
50
138
0
01 Dec 2022
Exploiting Category Names for Few-Shot Classification with Vision-Language Models
Taihong Xiao
Zirui Wang
Liangliang Cao
Jiahui Yu
Shengyang Dai
Ming Yang
VLM
MLLM
33
5
0
29 Nov 2022
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
Jiangyong Huang
William Zhu
Baoxiong Jia
Zan Wang
Xiaojian Ma
Qing Li
Siyuan Huang
40
5
0
28 Nov 2022
A Unified Framework for Contrastive Learning from a Perspective of Affinity Matrix
Wenbin Li
Meihao Kong
Xuesong Yang
Lei Wang
Jing Huo
Yang Gao
Jiebo Luo
33
0
0
26 Nov 2022
Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation
Liang Zeng
A. Lengyel
Nergis Tomen
Jan van Gemert
AI4TS
26
0
0
25 Nov 2022
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
Y. Liu
Wenbin Wang
Yibing Zhan
Shaoze Feng
Li-Yu Daisy Liu
Zhe Chen
SSL
24
13
0
24 Nov 2022
Self-Supervised Learning based on Heat Equation
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Lu Yuan
Zicheng Liu
Youzuo Lin
29
4
0
23 Nov 2022
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
Peng Jin
Jinfa Huang
Fenglin Liu
Xian Wu
Shen Ge
Guoli Song
David A. Clifton
Jing Chen
VLM
44
64
0
21 Nov 2022
Cross-Modal Contrastive Learning for Robust Reasoning in VQA
Qinjie Zheng
Chaoyue Wang
Daqing Liu
Dadong Wang
Dacheng Tao
LRM
32
0
0
21 Nov 2022
Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park
Jaesik Choi
ViT
36
1
0
18 Nov 2022
Self-Supervised Visual Representation Learning via Residual Momentum
T. Pham
Axi Niu
Zhang Kang
Sultan Rizky Hikmawan Madjid
Jiajing Hong
Daehyeok Kim
Joshua Tian Jin Tee
Chang D. Yoo
SSL
46
6
0
17 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
35
60
0
17 Nov 2022
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
ViT
MQ
29
1
0
17 Nov 2022
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Marc Fischer
Alexander Bartler
Bin Yang
SSeg
24
18
0
16 Nov 2022
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Tianhong Li
Huiwen Chang
Shlok Kumar Mishra
Han Zhang
Dina Katabi
Dilip Krishnan
41
152
0
16 Nov 2022
Masked Reconstruction Contrastive Learning with Information Bottleneck Principle
Ziwen Liu
Bonan Li
Congying Han
Tiande Guo
Xuecheng Nie
SSL
34
2
0
15 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
87
679
0
14 Nov 2022