ResearchTrend.AI
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification

26 June 2022
Jun-Teng Yang
Sheng-Che Kao
S. Huang

Papers citing "Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification"

16 / 16 papers shown
 1. Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching
    Mingi Ji, Byeongho Heo, Sungrae Park · 05 Feb 2021
 2. Refining activation downsampling with SoftPool
    Alexandros Stergiou, R. Poppe, Grigorios Kalliatakis · 02 Jan 2021
 3. Big Bird: Transformers for Longer Sequences
    Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · VLM · 28 Jul 2020
 4. Linformer: Self-Attention with Linear Complexity
    Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma · 08 Jun 2020
 5. Contrastive Representation Distillation
    Yonglong Tian, Dilip Krishnan, Phillip Isola · 23 Oct 2019
 6. Similarity-Preserving Knowledge Distillation
    Frederick Tung, Greg Mori · 23 Jul 2019
 7. Learning What and Where to Transfer
    Yunhun Jang, Hankook Lee, Sung Ju Hwang, Jinwoo Shin · 15 May 2019
 8. Generating Long Sequences with Sparse Transformers
    R. Child, Scott Gray, Alec Radford, Ilya Sutskever · 23 Apr 2019
 9. Correlation Congruence for Knowledge Distillation
    Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang · 03 Apr 2019
10. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer
    Sergey Zagoruyko, N. Komodakis · 12 Dec 2016
11. Adversarial Feature Learning
    Jiasen Lu, Philipp Krahenbuhl, Trevor Darrell · GAN · 31 May 2016
12. Wide Residual Networks
    Sergey Zagoruyko, N. Komodakis · 23 May 2016
13. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
    Song Han, Huizi Mao, W. Dally · 3DGS · 01 Oct 2015
14. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
    Ke Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, R. Zemel, Yoshua Bengio · DiffM · 10 Feb 2015
15. FitNets: Hints for Thin Deep Nets
    Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio · FedML · 19 Dec 2014
16. Neural Machine Translation by Jointly Learning to Align and Translate
    Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio · AIMat · 01 Sep 2014