Prototype-guided Cross-task Knowledge Distillation for Large-scale Models
arXiv: 2212.13180 · 26 December 2022
Deng Li, Aming Wu, Yahong Han, Qingwen Tian
Tags: VLM
Papers citing "Prototype-guided Cross-task Knowledge Distillation for Large-scale Models" (42 papers shown)

Knowledge Distillation Using Hierarchical Self-Supervision Augmented Distribution
Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu
15 citations · 07 Sep 2021

Distilling a Powerful Student Model via Online Knowledge Distillation
Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, Ling Shao, Rongrong Ji
FedML · 47 citations · 26 Mar 2021

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, B. Guo
ViT · 21,051 citations · 25 Mar 2021

Training data-efficient image transformers & distillation through attention
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
ViT · 6,657 citations · 23 Dec 2020

Transformer Interpretability Beyond Attention Visualization
Hila Chefer, Shir Gur, Lior Wolf
652 citations · 17 Dec 2020

Cross-Layer Distillation with Semantic Calibration
Defang Chen, Jian-Ping Mei, Yuan Zhang, Can Wang, Yan Feng, Chun Chen
FedML · 294 citations · 06 Dec 2020

Channel-wise Knowledge Distillation for Dense Prediction
Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen
261 citations · 26 Nov 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby
ViT · 40,217 citations · 22 Oct 2020

Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai
ViT · 4,993 citations · 08 Oct 2020

Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao
VLM · 2,907 citations · 09 Jun 2020

End-to-End Object Detection with Transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko
ViT, 3DV, PINN · 12,847 citations · 26 May 2020

HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li, Yen-Chun Chen, Yu Cheng, Zhe Gan, Licheng Yu, Jingjing Liu
MLLM, VLM, OffRL, AI4TS · 496 citations · 01 May 2020

Designing Network Design Spaces
Ilija Radosavovic, Raj Prateek Kosaraju, Ross B. Girshick, Kaiming He, Piotr Dollár
GNN · 1,672 citations · 30 Mar 2020

Prototype Rectification for Few-Shot Learning
Jinlu Liu, Liang Song, Yongqiang Qin
247 citations · 25 Nov 2019

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
7,386 citations · 02 Oct 2019

TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu
VLM · 1,838 citations · 23 Sep 2019

Patient Knowledge Distillation for BERT Model Compression
S. Sun, Yu Cheng, Zhe Gan, Jingjing Liu
833 citations · 25 Aug 2019

VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
VLM, MLLM, SSL · 1,657 citations · 22 Aug 2019

LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan, Mohit Bansal
VLM, MLLM · 2,467 citations · 20 Aug 2019

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma
1,583 citations · 18 Jun 2019

Distilling Object Detectors with Fine-grained Feature Imitation
Tao Wang, Li-xin Yuan, Xiaopeng Zhang, Jiashi Feng
ObjD · 379 citations · 09 Jun 2019

When Does Label Smoothing Help?
Rafael Müller, Simon Kornblith, Geoffrey E. Hinton
UQCV · 1,931 citations · 06 Jun 2019

Variational Information Distillation for Knowledge Transfer
SungSoo Ahn, S. Hu, Andreas C. Damianou, Neil D. Lawrence, Zhenwen Dai
612 citations · 11 Apr 2019

Relational Knowledge Distillation
Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho
1,396 citations · 10 Apr 2019

Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye, Mrigank Rochan, Zhi Liu, Yang Wang
EgoV · 472 citations · 09 Apr 2019

A Comprehensive Overhaul of Feature Distillation
Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, J. Choi
571 citations · 03 Apr 2019

Class-Balanced Loss Based on Effective Number of Samples
Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge J. Belongie
2,253 citations · 16 Jan 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
VLM, SSL, SSeg · 93,936 citations · 11 Oct 2018

Learning Deep Representations with Probabilistic Knowledge Transfer
Nikolaos Passalis, Anastasios Tefas
407 citations · 28 Mar 2018

To prune, or not to prune: exploring the efficacy of pruning for model compression
Michael Zhu, Suyog Gupta
1,262 citations · 05 Oct 2017

Semantic Foggy Scene Understanding with Synthetic Data
Christos Sakaridis, Dengxin Dai, Luc Van Gool
1,094 citations · 25 Aug 2017

Deep Hashing Network for Unsupervised Domain Adaptation
Hemanth Venkateswara, José Eusébio, Shayok Chakraborty, S. Panchanathan
OOD · 2,023 citations · 22 Jun 2017

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
3DV · 129,831 citations · 12 Jun 2017

Learning multiple visual domains with residual adapters
Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
OOD · 924 citations · 22 May 2017

Prototypical Networks for Few-shot Learning
Jake C. Snell, Kevin Swersky, R. Zemel
8,072 citations · 15 Mar 2017

Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer
Sergey Zagoruyko, N. Komodakis
2,561 citations · 12 Dec 2016

The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele
11,540 citations · 06 Apr 2016

Quantized Convolutional Neural Networks for Mobile Devices
Jiaxiang Wu, Cong Leng, Yuhang Wang, Qinghao Hu, Jian Cheng
MQ · 1,162 citations · 21 Dec 2015

Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton, Oriol Vinyals, J. Dean
FedML · 19,448 citations · 09 Mar 2015

FitNets: Hints for Thin Deep Nets
Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio
FedML · 3,862 citations · 19 Dec 2014

Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, C. L. Zitnick, Piotr Dollár
ObjD · 43,290 citations · 01 May 2014

Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana
2,114 citations · 21 Dec 2013