Knowledge distillation: A good teacher is patient and consistent (arXiv 2106.05237)
9 June 2021 [VLM]
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
Papers citing "Knowledge distillation: A good teacher is patient and consistent" (49 of 199 shown)
NeRN -- Learning Neural Representations for Neural Networks (27 Dec 2022) [3DH]
Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister
Joint Embedding of 2D and 3D Networks for Medical Image Anomaly Detection (21 Dec 2022) [3DH]
In-Joo Kang, Jinah Park
FlexiViT: One Model for All Patch Sizes (15 Dec 2022) [VLM]
Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetić
Progressive Learning without Forgetting (28 Nov 2022) [CLL, KELM]
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket (23 Nov 2022) [MQ]
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang
VeLO: Training Versatile Learned Optimizers by Scaling Up (17 Nov 2022)
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding (17 Nov 2022)
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling (15 Nov 2022)
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection (14 Nov 2022)
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation (09 Nov 2022) [ViT]
Florian Schmid, Khaled Koutini, Gerhard Widmer
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation (01 Nov 2022) [VLM]
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP (18 Oct 2022)
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu
Semantic Segmentation with Active Semi-Supervised Representation Learning (16 Oct 2022)
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
Knowledge Distillation approach towards Melanoma Detection (14 Oct 2022)
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed
Students taught by multimodal teachers are superior action recognizers (09 Oct 2022)
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars
Robust Active Distillation (03 Oct 2022)
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee
Global Semantic Descriptors for Zero-Shot Action Recognition (24 Sep 2022)
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti
TeST: Test-time Self-Training under Distribution Shift (23 Sep 2022) [TTA, OOD]
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele
Layerwise Bregman Representation Learning with Applications to Knowledge Distillation (15 Sep 2022)
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
Revisiting Neural Scaling Laws in Language and Vision (13 Sep 2022)
Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
Data Feedback Loops: Model-driven Amplification of Dataset Biases (08 Sep 2022)
Rohan Taori, Tatsunori B. Hashimoto
Effectiveness of Function Matching in Driving Scene Recognition (20 Aug 2022)
Shingo Yashima
SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs (08 Aug 2022) [GAN]
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis
Efficient One Pass Self-distillation with Zipf's Label Smoothing (26 Jul 2022)
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
Predicting Out-of-Domain Generalization with Neighborhood Invariance (05 Jul 2022) [OOD]
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi
What Knowledge Gets Distilled in Knowledge Distillation? (31 May 2022) [FedML]
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets (30 May 2022) [ViT, MedIm]
Leandro M. de Lima, R. Krohling
A Closer Look at Self-Supervised Lightweight Vision Transformers (28 May 2022) [ViT]
Shaoru Wang, Jin Gao, Zeming Li, Jian Sun, Weiming Hu
A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges (08 May 2022)
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong
Merging of neural networks (21 Apr 2022) [FedML, MoMe]
Martin Pasen, Vladimír Boza
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results (07 Apr 2022)
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy
Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes (01 Apr 2022)
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark
On the benefits of knowledge distillation for adversarial robustness (14 Mar 2022) [AAML, FedML]
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification (13 Mar 2022) [VLM]
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time (10 Mar 2022) [MoMe]
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning (03 Mar 2022) [VLM]
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Zou
Meta Knowledge Distillation (16 Feb 2022)
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
It's All in the Head: Representation Knowledge Distillation through Classifier Sharing (18 Jan 2022)
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir
SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation (13 Jan 2022)
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
Microdosing: Knowledge Distillation for GAN based Compression (07 Jan 2022)
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers
Ex-Model: Continual Learning from a Stream of Trained Models (13 Dec 2021) [CLL]
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu
A Fast Knowledge Distillation Framework for Visual Recognition (02 Dec 2021) [VLM]
Zhiqiang Shen, Eric P. Xing
The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image (01 Dec 2021)
Yuki M. Asano, Aaqib Saeed
PP-ShiTu: A Practical Lightweight Image Recognition System (01 Nov 2021) [CVBM]
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
Network Augmentation for Tiny Deep Learning (17 Oct 2021)
Han Cai, Chuang Gan, Ji Lin, Song Han
Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR (02 Aug 2021)
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le
Teacher's pet: understanding and mitigating biases in distillation (19 Jun 2021)
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar
Does Knowledge Distillation Really Work? (10 Jun 2021) [FedML]
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
On Improving Adversarial Transferability of Vision Transformers (08 Jun 2021) [ViT]
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Shahbaz Khan, Fatih Porikli