2104.10972
Cited By
v4 (latest)
ImageNet-21K Pretraining for the Masses
22 April 2021
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
Github (765★)
Papers citing
"ImageNet-21K Pretraining for the Masses"
50 / 427 papers shown
Title
MaskSketch: Unpaired Structure-guided Masked Image Generation
D. Bashkirova
José Lezama
Kihyuk Sohn
Kate Saenko
Irfan Essa
DiffM
60
25
0
10 Feb 2023
Key Design Choices for Double-Transfer in Source-Free Unsupervised Domain Adaptation
Andrea Maracani
Raffaello Camoriano
Elisa Maiettini
Davide Talon
Lorenzo Rosasco
Lorenzo Natale
85
2
0
10 Feb 2023
Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection
Xilie Xu
Jingfeng Zhang
Feng Liu
Masashi Sugiyama
Mohan S. Kankanhalli
AAML
104
17
0
08 Feb 2023
Leaving Reality to Imagination: Robust Classification via Generated Datasets
Hritik Bansal
Aditya Grover
OOD
117
94
0
05 Feb 2023
Referential communication in heterogeneous communities of pre-trained visual deep networks
Matéo Mahaut
Francesca Franzon
Roberto Dessì
Marco Baroni
74
7
0
04 Feb 2023
Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification
Kanishk Jain
Shyamgopal Karthik
Vineet Gandhi
61
6
0
01 Feb 2023
POSTER++: A simpler and stronger facial expression recognition network
Jia-ju Mao
Rui Xu
Xuesong Yin
Yuan Chang
Binling Nie
Aibin Huang
CVBM
85
37
0
28 Jan 2023
Zorro: the masked multimodal transformer
Adrià Recasens
Jason Lin
João Carreira
Drew Jaegle
Luyu Wang
...
Pauline Luc
Antoine Miech
Lucas Smaira
Ross Hemsley
Andrew Zisserman
92
21
0
23 Jan 2023
Image Memorability Prediction with Vision Transformers
Thomas Hagen
T. Espeseth
ViT
55
8
0
20 Jan 2023
Open-Set Likelihood Maximization for Few-Shot Learning
Malik Boudiaf
Etienne Bennequin
Myriam Tami
Antoine Toubhans
Pablo Piantanida
Céline Hudelot
Ismail Ben Ayed
BDL
125
10
0
20 Jan 2023
SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network
D. Shim
H. J. Kim
ViT
MDE
110
24
0
17 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
111
17
0
12 Jan 2023
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
Byoungjip Kim
Sun Choi
Dasol Hwang
Moontae Lee
Honglak Lee
71
11
0
07 Jan 2023
Learning Trajectory-Word Alignments for Video-Language Tasks
Xu Yang
Zhang Li
Haiyang Xu
Hanwang Zhang
Qinghao Ye
Chenliang Li
Ming Yan
Yu Zhang
Fei Huang
Songfang Huang
80
7
0
05 Jan 2023
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
Qinghao Ye
Guohai Xu
Ming Yan
Haiyang Xu
Qi Qian
Ji Zhang
Fei Huang
VLM
AI4TS
220
75
0
30 Dec 2022
Robust Meta-Representation Learning via Global Label Inference and Classification
Ruohan Wang
Isak Falk
Massimiliano Pontil
C. Ciliberto
106
3
0
22 Dec 2022
Reversible Column Networks
Yuxuan Cai
Yi Zhou
Qi Han
Jianjian Sun
Xiangwen Kong
Jun Yu Li
Xiangyu Zhang
VLM
92
59
0
22 Dec 2022
Masked Event Modeling: Self-Supervised Pretraining for Event Cameras
Simone Klenk
David Bonello
Lukas Koestler
Nikita Araslanov
Daniel Cremers
92
25
0
20 Dec 2022
How to Train an Accurate and Efficient Object Detection Model on Any Dataset
Galina Zalesskaya
B. Bylicka
Eugene Liu
3DH
80
4
0
30 Nov 2022
Receptive Field Refinement for Convolutional Neural Networks Reliably Improves Predictive Performance
Mats L. Richter
C. Pal
70
3
0
26 Nov 2022
Expanding Small-Scale Datasets with Guided Imagination
Yifan Zhang
Daquan Zhou
Bryan Hooi
Kaixin Wang
Jiashi Feng
171
48
0
25 Nov 2022
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
Yulin Wang
Yang Yue
Rui Lu
Tian-De Liu
Zhaobai Zhong
S. Song
Gao Huang
90
29
0
17 Nov 2022
Joint Deep Learning for Improved Myocardial Scar Detection from Cardiac MRI
Jiarui Xing
Shuo Wang
K. Bilchick
Amit R. Patel
Miaomiao Zhang
18
2
0
11 Nov 2022
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining
Qiang Chen
Jian Wang
Chuchu Han
Shangang Zhang
Zexian Li
...
Haocheng Feng
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
ViT
VLM
92
45
0
07 Nov 2022
On the Informativeness of Supervision Signals
Ilia Sucholutsky
Ruairidh M. Battleday
Katherine M. Collins
Raja Marjieh
Joshua C. Peterson
Pulkit Singh
Umang Bhatt
Nori Jacoby
Adrian Weller
Thomas Griffiths
96
13
0
02 Nov 2022
Fully-attentive and interpretable: vision and video vision transformers for pain detection
Giacomo Fiorentini
Itir Onal Ertugrul
A. A. Salah
MedIm
ViT
86
2
0
27 Oct 2022
ProContEXT: Exploring Progressive Context Transformer for Tracking
Jinpeng Lan
Zhi-Qi Cheng
Ju He
Chenyang Li
Bin Luo
Xueting Bao
Wangmeng Xiang
Yifeng Geng
Xuansong Xie
102
31
0
27 Oct 2022
The Robustness Limits of SoTA Vision Models to Natural Variation
Mark Ibrahim
Q. Garrido
Ari S. Morcos
Diane Bouchacourt
VLM
99
16
0
24 Oct 2022
Deep Model Reassembly
Xingyi Yang
Zhou Daquan
Songhua Liu
Jingwen Ye
Xinchao Wang
MoMe
98
129
0
24 Oct 2022
Anomaly Detection Requires Better Representations
Tal Reiss
Niv Cohen
Eliahu Horwitz
Ron Abutbul
Yedid Hoshen
OOD
AI4TS
SSL
129
21
0
19 Oct 2022
A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning
Paul Janson
Wenxuan Zhang
Rahaf Aljundi
Mohamed Elhoseiny
VLM
SSL
CLL
78
52
0
10 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
236
148
0
05 Oct 2022
Medical Image Retrieval via Nearest Neighbor Search on Pre-trained Image Features
Deepa Gupta
R. Loane
Soumya Gayen
Dina Demner-Fushman
MedIm
60
7
0
05 Oct 2022
ASIF: Coupled Data Turns Unimodal Models to Multimodal Without Training
Antonio Norelli
Marco Fumero
Valentino Maiorca
Luca Moschella
Emanuele Rodolà
Francesco Locatello
VLM
166
36
0
04 Oct 2022
Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input
Zilun Zhang
Farzad Khalvati
MedIm
ViT
47
10
0
03 Oct 2022
Early or Late Fusion Matters: Efficient RGB-D Fusion in Vision Transformers for 3D Object Recognition
Georgios Tziafas
Hamidreza Kasaei
ViT
73
12
0
03 Oct 2022
Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods
Skanda Koppula
Yazhe Li
Evan Shelhamer
Andrew Jaegle
Nikhil Parthasarathy
Relja Arandjelović
João Carreira
Olivier J. Hénaff
84
9
0
30 Sep 2022
Leveraging Self-Supervised Training for Unintentional Action Recognition
Enea Duka
Anna Kukleva
Bernt Schiele
69
1
0
23 Sep 2022
Top-Tuning: a study on transfer learning for an efficient alternative to fine tuning for image classification with fast kernel methods
P. D. Alfano
Vito Paolo Pastore
Lorenzo Rosasco
Francesca Odone
49
7
0
16 Sep 2022
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network
Tiancheng Zhao
Peng Liu
Kyusong Lee
VLM
MLLM
ObjD
42
5
0
10 Sep 2022
Fine-grain Inference on Out-of-Distribution Data with Hierarchical Classification
Randolph Linderman
Jingyang Zhang
Nathan Inkawhich
H. Li
Yiran Chen
OODD
175
7
0
09 Sep 2022
Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond
Cheng-Yen Hsieh
Chih-Jung Chang
Fu-En Yang
Yu-Chiang Frank Wang
SSL
81
8
0
30 Aug 2022
Synthetic Latent Fingerprint Generator
André Brasil Vieira Wyzykowski
A.K. Jain
58
13
0
29 Aug 2022
Prompt-Matched Semantic Segmentation
Lingbo Liu
Jianlong Chang
Bruce X. B. Yu
Liang Lin
Qi Tian
Changrui Chen
VPVLM
VLM
111
29
0
22 Aug 2022
Open Vocabulary Multi-Label Classification with Dual-Modal Decoder on Aligned Visual-Textual Features
Shichao Xu
Yikang Li
Jenhao Hsiao
C. Ho
Zhuang Qi
67
8
0
19 Aug 2022
Abutting Grating Illusion: Cognitive Challenge to Neural Network Models
Jinyu Fan
Yi Zeng
AAML
58
1
0
08 Aug 2022
GPPF: A General Perception Pre-training Framework via Sparsely Activated Multi-Task Learning
Benyuan Sun
Jinqiao Dai
Zihao Liang
Cong Liu
Yi Yang
Bo Bai
MoE
75
4
0
03 Aug 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
115
256
0
28 Jul 2022
An Impartial Take to the CNN vs Transformer Robustness Contest
Francesco Pinto
Philip Torr
P. Dokania
UQCV
AAML
100
49
0
22 Jul 2022
TinyViT: Fast Pretraining Distillation for Small Vision Transformers
Kan Wu
Jinnian Zhang
Houwen Peng
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
79
267
0
21 Jul 2022