Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
arXiv:2106.05237 · 9 June 2021 · VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

Showing 49 of 199 citing papers.
NeRN -- Learning Neural Representations for Neural Networks
Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister
27 Dec 2022 · 3DH

Joint Embedding of 2D and 3D Networks for Medical Image Anomaly Detection
In-Joo Kang, Jinah Park
21 Dec 2022 · 3DH

FlexiViT: One Model for All Patch Sizes
Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetić
15 Dec 2022 · VLM

Progressive Learning without Forgetting
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang
28 Nov 2022 · CLL, KELM

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang
23 Nov 2022 · MQ

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
17 Nov 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
17 Nov 2022

KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han
15 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
14 Nov 2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid, Khaled Koutini, Gerhard Widmer
09 Nov 2022 · ViT

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
01 Nov 2022 · VLM

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu
18 Oct 2022

Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
16 Oct 2022

Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed
14 Oct 2022

Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars
09 Oct 2022

Robust Active Distillation
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee
03 Oct 2022

Global Semantic Descriptors for Zero-Shot Action Recognition
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti
24 Sep 2022

TeST: Test-time Self-Training under Distribution Shift
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele
23 Sep 2022 · TTA, OOD

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
15 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
13 Sep 2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B. Hashimoto
08 Sep 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima
20 Aug 2022

SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis
08 Aug 2022 · GAN

Efficient One Pass Self-distillation with Zipf's Label Smoothing
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
26 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi
05 Jul 2022 · OOD

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee
31 May 2022 · FedML

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima, R. Krohling
30 May 2022 · ViT, MedIm

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian Sun, Weiming Hu
28 May 2022 · ViT

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong
08 May 2022

Merging of neural networks
Martin Pasen, Vladimír Boza
21 Apr 2022 · FedML, MoMe

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy
07 Apr 2022

Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark
01 Apr 2022

On the benefits of knowledge distillation for adversarial robustness
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard
14 Mar 2022 · AAML, FedML

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass
13 Mar 2022 · VLM

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt
10 Mar 2022 · MoMe

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Zou
03 Mar 2022 · VLM

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
16 Feb 2022

It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir
18 Jan 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
13 Jan 2022

Microdosing: Knowledge Distillation for GAN based Compression
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers
07 Jan 2022

Ex-Model: Continual Learning from a Stream of Trained Models
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu
13 Dec 2021 · CLL

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric P. Xing
02 Dec 2021 · VLM

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed
01 Dec 2021

PP-ShiTu: A Practical Lightweight Image Recognition System
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
01 Nov 2021 · CVBM

Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
17 Oct 2021

Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le
02 Aug 2021

Teacher's pet: understanding and mitigating biases in distillation
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar
19 Jun 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
10 Jun 2021 · FedML

On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Shahbaz Khan, Fatih Porikli
08 Jun 2021 · ViT