Meta Knowledge Distillation
16 February 2022
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu

Papers citing "Meta Knowledge Distillation" (35 of 35 papers shown)

1. Masked Autoencoders Are Scalable Vision Learners
   Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
   ViT, TPM · 454 / 7,739 / 0 · 11 Nov 2021

2. Isotonic Data Augmentation for Knowledge Distillation
   Wanyun Cui, Sen Yan
   54 / 7 / 0 · 03 Jul 2021

3. Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
   Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Dacheng Tao, Yunhe Wang
   ViT · 76 / 71 / 0 · 03 Jul 2021

4. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
   Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer
   ViT · 107 / 632 / 0 · 18 Jun 2021

5. BEiT: BERT Pre-Training of Image Transformers
   Hangbo Bao, Li Dong, Songhao Piao, Furu Wei
   ViT · 258 / 2,824 / 0 · 15 Jun 2021

6. Knowledge distillation: A good teacher is patient and consistent
   Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
   VLM · 107 / 295 / 0 · 09 Jun 2021

7. Emerging Properties in Self-Supervised Vision Transformers
   Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
   674 / 6,066 / 0 · 29 Apr 2021

8. An Empirical Study of Training Self-Supervised Vision Transformers
   Xinlei Chen, Saining Xie, Kaiming He
   ViT · 154 / 1,862 / 0 · 05 Apr 2021

9. Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study
   Zhiqiang Shen, Zechun Liu, Dejia Xu, Zitian Chen, Kwang-Ting Cheng, Marios Savvides
   43 / 76 / 0 · 01 Apr 2021

10. Going deeper with Image Transformers
    Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
    ViT · 154 / 1,014 / 0 · 31 Mar 2021

11. Training data-efficient image transformers & distillation through attention
    Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
    ViT · 377 / 6,762 / 0 · 23 Dec 2020

12. Meta-Learning with Adaptive Hyperparameters
    Sungyong Baik, Myungsub Choi, Janghoon Choi, Heewon Kim, Kyoung Mu Lee
    66 / 125 / 0 · 31 Oct 2020

13. An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation
    Deepan Das, Haley Massa, Abhimanyu Kulkarni, Theodoros Rekatsinas
    46 / 18 / 0 · 06 Jun 2020

14. Meta Pseudo Labels
    Hieu H. Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le
    VLM · 335 / 669 / 0 · 23 Mar 2020

15. Contrastive Representation Distillation
    Yonglong Tian, Dilip Krishnan, Phillip Isola
    144 / 1,049 / 0 · 23 Oct 2019

16. On the Efficacy of Knowledge Distillation
    Jang Hyun Cho, Bharath Hariharan
    94 / 609 / 0 · 03 Oct 2019

17. RandAugment: Practical automated data augmentation with a reduced search space
    E. D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le
    MQ · 221 / 3,485 / 0 · 30 Sep 2019

18. Adaptive Gradient-Based Meta-Learning Methods
    M. Khodak, Maria-Florina Balcan, Ameet Talwalkar
    FedML · 83 / 355 / 0 · 06 Jun 2019

19. When Does Label Smoothing Help?
    Rafael Müller, Simon Kornblith, Geoffrey E. Hinton
    UQCV · 191 / 1,945 / 0 · 06 Jun 2019

20. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
    Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Y. Yoo
    OOD · 609 / 4,778 / 0 · 13 May 2019

21. Variational Information Distillation for Knowledge Transfer
    SungSoo Ahn, S. Hu, Andreas C. Damianou, Neil D. Lawrence, Zhenwen Dai
    89 / 620 / 0 · 11 Apr 2019

22. Relational Knowledge Distillation
    Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho
    65 / 1,412 / 0 · 10 Apr 2019

23. Improved Knowledge Distillation via Teacher Assistant
    Seyed Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, H. Ghasemzadeh
    92 / 1,075 / 0 · 09 Feb 2019

24. Meta-Gradient Reinforcement Learning
    Zhongwen Xu, H. V. Hasselt, David Silver
    107 / 324 / 0 · 24 May 2018

25. Paraphrasing Complex Network: Network Compression via Factor Transfer
    Jangho Kim, Seonguk Park, Nojun Kwak
    69 / 550 / 0 · 14 Feb 2018

26. mixup: Beyond Empirical Risk Minimization
    Hongyi Zhang, Moustapha Cissé, Yann N. Dauphin, David Lopez-Paz
    NoLa · 278 / 9,760 / 0 · 25 Oct 2017

27. Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
    Zhenguo Li, Fengwei Zhou, Fei Chen, Hang Li
    94 / 1,119 / 0 · 31 Jul 2017

28. Attention Is All You Need
    Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
    3DV · 687 / 131,526 / 0 · 12 Jun 2017

29. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer
    Sergey Zagoruyko, N. Komodakis
    118 / 2,579 / 0 · 12 Dec 2016

30. Layer Normalization
    Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton
    410 / 10,482 / 0 · 21 Jul 2016

31. Deep Networks with Stochastic Depth
    Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger
    212 / 2,356 / 0 · 30 Mar 2016

32. Deep Residual Learning for Image Recognition
    Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    MedIm · 2.2K / 193,878 / 0 · 10 Dec 2015

33. Rethinking the Inception Architecture for Computer Vision
    Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Z. Wojna
    3DV, BDL · 878 / 27,358 / 0 · 02 Dec 2015

34. Distilling the Knowledge in a Neural Network
    Geoffrey E. Hinton, Oriol Vinyals, J. Dean
    FedML · 344 / 19,643 / 0 · 09 Mar 2015

35. FitNets: Hints for Thin Deep Nets
    Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio
    FedML · 303 / 3,883 / 0 · 19 Dec 2014