ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu
arXiv:2012.14022 · 27 December 2020
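As the title indicates, ALP-KD replaces the usual hand-picked student-to-teacher layer mapping in intermediate-layer distillation with an attention mechanism: each student layer attends over all teacher layers and is trained to match the attention-weighted combination of their hidden states. The PyTorch sketch below is a minimal illustration of that scheme under simplifying assumptions (equal student/teacher hidden sizes, scaled dot-product scores); the function name and shapes are illustrative, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def alp_kd_layer_loss(student_hiddens, teacher_hiddens):
    """Sketch of attention-based layer projection for distillation.

    Instead of pairing student layer i with one hand-picked teacher
    layer, every student layer attends over *all* teacher layers and
    is regressed toward the attention-weighted mixture.

    Assumes student and teacher share a hidden size (otherwise a
    linear projection would be needed first).
      student_hiddens: list of tensors, each (batch, seq, dim)
      teacher_hiddens: list of tensors, each (batch, seq, dim)
    """
    teacher = torch.stack(teacher_hiddens, dim=2)  # (B, L, T_layers, D)
    dim = teacher.size(-1)
    loss = 0.0
    for s in student_hiddens:                      # s: (B, L, D)
        # attention of this student layer over all teacher layers
        scores = torch.einsum("bld,bltd->blt", s, teacher) / dim ** 0.5
        attn = scores.softmax(dim=-1)              # (B, L, T_layers)
        # attention-weighted mixture of teacher states as the target
        target = torch.einsum("blt,bltd->bld", attn, teacher)
        loss = loss + F.mse_loss(s, target)
    return loss / len(student_hiddens)

# toy usage: a 12-layer teacher distilled into a 4-layer student
teacher_h = [torch.randn(2, 16, 64) for _ in range(12)]
student_h = [torch.randn(2, 16, 64, requires_grad=True) for _ in range(4)]
print(alp_kd_layer_loss(student_h, teacher_h))
```

Averaging over student layers and the scaled dot-product scores are common defaults assumed here, not details taken from the paper.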
Papers citing "ALP-KD: Attention-Based Layer Projection for Knowledge Distillation" (20 of 20 papers shown):
1. Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization
   Jixuan Leng, Yijiang Li, Haohan Wang · 26 Nov 2023 · [VLM] · 0 citations
2. GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
   Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, ..., Jiahao Liu, Jingang Wang, Shuo Zhao, Peng Zhang, Jie Tang · 11 Jun 2023 · [ALM, MoE] · 11 citations
3. Student-friendly Knowledge Distillation
   Mengyang Yuan, Bo Lang, Fengnan Quan · 18 May 2023 · 17 citations
4. CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
   Guohao Li, Hasan Hammoud, Hani Itani, Dmitrii Khizbullin, Bernard Ghanem · 31 Mar 2023 · [SyDa, ALM] · 412 citations
5. Graph-based Knowledge Distillation: A survey and experimental evaluation
   Jing Liu, Tongya Zheng, Guanzheng Zhang, Qinfen Hao · 27 Feb 2023 · 8 citations
6. Audio Representation Learning by Distilling Video as Privileged Information
   Amirhossein Hajavi, Ali Etemad · 06 Feb 2023 · 4 citations
7. LEAD: Liberal Feature-based Distillation for Dense Retrieval
   Hao Sun, Xiao Liu, Yeyun Gong, Anlei Dong, Jing Lu, Yan Zhang, Linjun Yang, Rangan Majumder, Nan Duan · 10 Dec 2022 · 2 citations
8. Knowledge Distillation of Transformer-based Language Models Revisited
   Chengqiang Lu, Jianwei Zhang, Yunfei Chu, Zhengyu Chen, Jingren Zhou, Fei Wu, Haiqing Chen, Hongxia Yang · 29 Jun 2022 · [VLM] · 10 citations
9. CILDA: Contrastive Data Augmentation using Intermediate Layer Knowledge Distillation
   Md. Akmal Haidar, Mehdi Rezagholizadeh, Abbas Ghaddar, Khalil Bibi, Philippe Langlais, Pascal Poupart · 15 Apr 2022 · [CLL] · 6 citations
10. Enabling All In-Edge Deep Learning: A Literature Review
    Praveen Joshi, Mohammed Hasanuzzaman, Chandra Thapa, Haithem Afli, T. Scully · 07 Apr 2022 · 22 citations
11. How and When Adversarial Robustness Transfers in Knowledge Distillation?
    Rulin Shao, Ming Zhou, C. Bezemer, Cho-Jui Hsieh · 22 Oct 2021 · [AAML] · 17 citations
12. Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher
    Mehdi Rezagholizadeh, A. Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, A. Ghodsi · 16 Oct 2021 · 18 citations
13. A Short Study on Compressing Decoder-Based Language Models
    Tianda Li, Yassir El Mesbahi, I. Kobyzev, Ahmad Rashid, A. Mahmud, Nithin Anchuri, Habib Hajimolahoseini, Yang Liu, Mehdi Rezagholizadeh · 16 Oct 2021 · 25 citations
14. Kronecker Decomposition for GPT Compression
    Ali Edalati, Marzieh S. Tahaei, Ahmad Rashid, V. Nia, J. Clark, Mehdi Rezagholizadeh · 15 Oct 2021 · 33 citations
15. RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation
    Md. Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart · 21 Sep 2021 · 22 citations
16. Knowledge Distillation with Noisy Labels for Natural Language Understanding
    Shivendra Bhardwaj, Abbas Ghaddar, Ahmad Rashid, Khalil Bibi, Cheng-huan Li, A. Ghodsi, Philippe Langlais, Mehdi Rezagholizadeh · 21 Sep 2021 · 1 citation
17. How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding
    Tianda Li, Ahmad Rashid, A. Jafari, Pranav Sharma, A. Ghodsi, Mehdi Rezagholizadeh · 13 Sep 2021 · [AAML] · 5 citations
18. Towards Zero-Shot Knowledge Distillation for Natural Language Processing
    Ahmad Rashid, Vasileios Lioutas, Abbas Ghaddar, Mehdi Rezagholizadeh · 31 Dec 2020 · 27 citations
19. Knowledge Distillation: A Survey
    Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao · 09 Jun 2020 · [VLM] · 2,843 citations
20. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
    Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · 20 Apr 2018 · [ELM] · 6,984 citations