Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
arXiv:2106.05237 · 9 June 2021 · VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

Showing 49 of 199 citing papers.
NeRN -- Learning Neural Representations for Neural Networks
Maor Ashkenazi, Zohar Rimon, Ron Vainshtein, Shir Levi, Elad Richardson, Pinchas Mintz, Eran Treister
27 Dec 2022 · 3DH

Joint Embedding of 2D and 3D Networks for Medical Image Anomaly Detection
In-Joo Kang, Jinah Park
21 Dec 2022 · 3DH

FlexiViT: One Model for All Patch Sizes
Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetić
15 Dec 2022 · VLM

Progressive Learning without Forgetting
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang
28 Nov 2022 · CLL, KELM

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang
23 Nov 2022 · MQ

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
17 Nov 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
17 Nov 2022

KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han
15 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
14 Nov 2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid, Khaled Koutini, Gerhard Widmer
09 Nov 2022 · ViT

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
01 Nov 2022 · VLM

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu
18 Oct 2022

Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
16 Oct 2022

Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed
14 Oct 2022

Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars
09 Oct 2022

Robust Active Distillation
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee
03 Oct 2022

Global Semantic Descriptors for Zero-Shot Action Recognition
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti
24 Sep 2022

TeST: Test-time Self-Training under Distribution Shift
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele
23 Sep 2022 · TTA, OOD

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
15 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
13 Sep 2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B. Hashimoto
08 Sep 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima
20 Aug 2022

SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis
08 Aug 2022 · GAN

Efficient One Pass Self-distillation with Zipf's Label Smoothing
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
26 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi
05 Jul 2022 · OOD

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee
31 May 2022 · FedML

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima, R. Krohling
30 May 2022 · ViT, MedIm

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian Sun, Weiming Hu
28 May 2022 · ViT

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong
08 May 2022

Merging of neural networks
Martin Pasen, Vladimír Boza
21 Apr 2022 · FedML, MoMe

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy
07 Apr 2022

Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark
01 Apr 2022

On the benefits of knowledge distillation for adversarial robustness
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard
14 Mar 2022 · AAML, FedML

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass
13 Mar 2022 · VLM

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt
10 Mar 2022 · MoMe

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Zou
03 Mar 2022 · VLM

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
16 Feb 2022

It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir
18 Jan 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
13 Jan 2022

Microdosing: Knowledge Distillation for GAN based Compression
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers
07 Jan 2022

Ex-Model: Continual Learning from a Stream of Trained Models
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu
13 Dec 2021 · CLL

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric P. Xing
02 Dec 2021 · VLM

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed
01 Dec 2021

PP-ShiTu: A Practical Lightweight Image Recognition System
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
01 Nov 2021 · CVBM

Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
17 Oct 2021

Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le
02 Aug 2021

Teacher's pet: understanding and mitigating biases in distillation
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar
19 Jun 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
10 Jun 2021 · FedML

On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Shahbaz Khan, Fatih Porikli
08 Jun 2021 · ViT