Towards Zero-Shot Knowledge Distillation for Natural Language Processing

31 December 2020

Papers citing "Towards Zero-Shot Knowledge Distillation for Natural Language Processing"

20 / 20 papers shown

Title
Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models Junjie Yang Junhao Song Xudong Han Ziqian Bi Tianyang Wang ... Yuyao Zhang Qian Niu Benji Peng Keyu Chen Ming Liu VLM 47 0 0 18 Apr 2025
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey Saurav Pawar S.M. Towhidul Islam Tonmoy S. M. M. Zaman Vinija Jain Aman Chadha Amitava Das 37 27 0 15 Jan 2024
Data-Free Distillation of Language Model by Text-to-Text Transfer Zheyuan Bai Xinduo Liu Hailin Hu Tianyu Guo Qinghua Zhang Yunhe Wang 48 2 0 03 Nov 2023
LLM-Pruner: On the Structural Pruning of Large Language Models Xinyin Ma Gongfan Fang Xinchao Wang 30 364 0 19 May 2023
In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models Yukun Huang Yanda Chen Zhou Yu Kathleen McKeown 27 30 0 20 Dec 2022
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization A. Jafari I. Kobyzev Mehdi Rezagholizadeh Pascal Poupart A. Ghodsi VLM 20 5 0 12 Dec 2022
Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt Xinyin Ma Xinchao Wang Gongfan Fang Yongliang Shen Weiming Lu 19 11 0 16 May 2022
CILDA: Contrastive Data Augmentation using Intermediate Layer Knowledge Distillation Md. Akmal Haidar Mehdi Rezagholizadeh Abbas Ghaddar Khalil Bibi Philippe Langlais Pascal Poupart CLL 30 6 0 15 Apr 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation Jiacheng Ye Jiahui Gao Qintong Li Hang Xu Jiangtao Feng Zhiyong Wu Tao Yu Lingpeng Kong SyDa 45 212 0 16 Feb 2022
Data-Free Knowledge Transfer: A Survey Yuang Liu Wei Zhang Jun Wang Jianyong Wang 32 48 0 31 Dec 2021
Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher Mehdi Rezagholizadeh A. Jafari Puneeth Salad Pranav Sharma Ali Saheb Pasand A. Ghodsi 76 17 0 16 Oct 2021
A Short Study on Compressing Decoder-Based Language Models Tianda Li Yassir El Mesbahi I. Kobyzev Ahmad Rashid A. Mahmud Nithin Anchuri Habib Hajimolahoseini Yang Liu Mehdi Rezagholizadeh 93 25 0 16 Oct 2021
Improving Question Answering Performance Using Knowledge Distillation and Active Learning Yasaman Boreshban Seyed Morteza Mirbostani Gholamreza Ghassem-Sani Seyed Abolghasem Mirroshandel Shahin Amiriparian 26 15 0 26 Sep 2021
RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation Md. Akmal Haidar Nithin Anchuri Mehdi Rezagholizadeh Abbas Ghaddar Philippe Langlais Pascal Poupart 31 22 0 21 Sep 2021
Knowledge Distillation with Noisy Labels for Natural Language Understanding Shivendra Bhardwaj Abbas Ghaddar Ahmad Rashid Khalil Bibi Cheng-huan Li A. Ghodsi Philippe Langlais Mehdi Rezagholizadeh 19 1 0 21 Sep 2021
How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding Tianda Li Ahmad Rashid A. Jafari Pranav Sharma A. Ghodsi Mehdi Rezagholizadeh AAML 27 5 0 13 Sep 2021
Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax Ehsan Kamalloo Mehdi Rezagholizadeh Peyman Passban Ali Ghodsi AAML 14 17 0 28 May 2021
MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation Ahmad Rashid Vasileios Lioutas Mehdi Rezagholizadeh AAML 13 36 0 12 May 2021
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT Sheng Shen Zhen Dong Jiayu Ye Linjian Ma Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 233 576 0 12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 297 6,959 0 20 Apr 2018