Meta Knowledge Distillation (arXiv:2202.07940)
16 February 2022
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
Papers citing "Meta Knowledge Distillation" (35 of 35 shown):
1. Masked Autoencoders Are Scalable Vision Learners (11 Nov 2021)
   Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick | ViT, TPM | 451 / 7,739 / 0
2. Isotonic Data Augmentation for Knowledge Distillation (03 Jul 2021)
   Wanyun Cui, Sen Yan | 54 / 7 / 0
3. Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation (03 Jul 2021)
   Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Dacheng Tao, Yunhe Wang | ViT | 76 / 71 / 0
4. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers (18 Jun 2021)
   Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer | ViT | 107 / 632 / 0
5. BEiT: BERT Pre-Training of Image Transformers (15 Jun 2021)
   Hangbo Bao, Li Dong, Songhao Piao, Furu Wei | ViT | 258 / 2,824 / 0
6. Knowledge distillation: A good teacher is patient and consistent (09 Jun 2021)
   Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov | VLM | 107 / 295 / 0
7. Emerging Properties in Self-Supervised Vision Transformers (29 Apr 2021)
   Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin | 668 / 6,066 / 0
8. An Empirical Study of Training Self-Supervised Vision Transformers (05 Apr 2021)
   Xinlei Chen, Saining Xie, Kaiming He | ViT | 154 / 1,862 / 0
9. Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study (01 Apr 2021)
   Zhiqiang Shen, Zechun Liu, Dejia Xu, Zitian Chen, Kwang-Ting Cheng, Marios Savvides | 43 / 76 / 0
10. Going deeper with Image Transformers (31 Mar 2021)
    Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou | ViT | 152 / 1,014 / 0
11. Training data-efficient image transformers & distillation through attention (23 Dec 2020)
    Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou | ViT | 377 / 6,762 / 0
12. Meta-Learning with Adaptive Hyperparameters (31 Oct 2020)
    Sungyong Baik, Myungsub Choi, Janghoon Choi, Heewon Kim, Kyoung Mu Lee | 66 / 125 / 0
13. An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation (06 Jun 2020)
    Deepan Das, Haley Massa, Abhimanyu Kulkarni, Theodoros Rekatsinas | 46 / 18 / 0
14. Meta Pseudo Labels (23 Mar 2020)
    Hieu H. Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le | VLM | 335 / 669 / 0
15. Contrastive Representation Distillation (23 Oct 2019)
    Yonglong Tian, Dilip Krishnan, Phillip Isola | 144 / 1,049 / 0
16. On the Efficacy of Knowledge Distillation (03 Oct 2019)
    Ligang He, Rui Mao | 94 / 609 / 0
17. RandAugment: Practical automated data augmentation with a reduced search space (30 Sep 2019)
    E. D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le | MQ | 221 / 3,485 / 0
18. Adaptive Gradient-Based Meta-Learning Methods (06 Jun 2019)
    M. Khodak, Maria-Florina Balcan, Ameet Talwalkar | FedML | 83 / 355 / 0
19. When Does Label Smoothing Help? (06 Jun 2019)
    Rafael Müller, Simon Kornblith, Geoffrey E. Hinton | UQCV | 191 / 1,945 / 0
20. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (13 May 2019)
    Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Y. Yoo | OOD | 609 / 4,778 / 0
21. Variational Information Distillation for Knowledge Transfer (11 Apr 2019)
    SungSoo Ahn, S. Hu, Andreas C. Damianou, Neil D. Lawrence, Zhenwen Dai | 89 / 620 / 0
22. Relational Knowledge Distillation (10 Apr 2019)
    Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho | 65 / 1,412 / 0
23. Improved Knowledge Distillation via Teacher Assistant (09 Feb 2019)
    Seyed Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, H. Ghasemzadeh | 92 / 1,075 / 0
24. Meta-Gradient Reinforcement Learning (24 May 2018)
    Zhongwen Xu, H. V. Hasselt, David Silver | 107 / 324 / 0
25. Paraphrasing Complex Network: Network Compression via Factor Transfer (14 Feb 2018)
    Jangho Kim, Seonguk Park, Nojun Kwak | 66 / 550 / 0
26. mixup: Beyond Empirical Risk Minimization (25 Oct 2017)
    Hongyi Zhang, Moustapha Cissé, Yann N. Dauphin, David Lopez-Paz | NoLa | 276 / 9,760 / 0
27. Meta-SGD: Learning to Learn Quickly for Few-Shot Learning (31 Jul 2017)
    Zhenguo Li, Fengwei Zhou, Fei Chen, Hang Li | 94 / 1,119 / 0
28. Attention Is All You Need (12 Jun 2017)
    Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV | 687 / 131,526 / 0
29. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer (12 Dec 2016)
    Sergey Zagoruyko, N. Komodakis | 118 / 2,579 / 0
30. Layer Normalization (21 Jul 2016)
    Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton | 410 / 10,482 / 0
31. Deep Networks with Stochastic Depth (30 Mar 2016)
    Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger | 212 / 2,356 / 0
32. Deep Residual Learning for Image Recognition (10 Dec 2015)
    Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | MedIm | 2.2K / 193,878 / 0
33. Rethinking the Inception Architecture for Computer Vision (02 Dec 2015)
    Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Z. Wojna | 3DV, BDL | 878 / 27,358 / 0
34. Distilling the Knowledge in a Neural Network (09 Mar 2015)
    Geoffrey E. Hinton, Oriol Vinyals, J. Dean | FedML | 344 / 19,643 / 0
35. FitNets: Hints for Thin Deep Nets (19 Dec 2014)
    Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio | FedML | 300 / 3,883 / 0