arXiv: 2009.07253
Autoregressive Knowledge Distillation through Imitation Learning
15 September 2020
Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei
Papers citing "Autoregressive Knowledge Distillation through Imitation Learning" (9 papers shown):
- SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
  Jahyun Koo, Yerin Hwang, Yongil Kim, Taegwan Kang, Hyunkyung Bae, Kyomin Jung
  25 Oct 2024
- Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
  Wenyuan Xu, Rujun Han, Zhenting Wang, L. Le, Dhruv Madeka, Lei Li, Luu Anh Tuan, Rishabh Agarwal, Chen-Yu Lee, Tomas Pfister
  15 Oct 2024
- f-Divergence Minimization for Sequence-Level Knowledge Distillation
  Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
  27 Jul 2023
- Target-Side Augmentation for Document-Level Machine Translation
  Guangsheng Bao, Zhiyang Teng, Yue Zhang
  08 May 2023
- Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
  Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, ..., Vincent Zhao, Yuexin Wu, Bo-wen Li, Yu Zhang, Ming-Wei Chang
  11 Apr 2023 [BDL, AI4CE]
- Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation
  Michalis Korakakis, Andreas Vlachos
  13 Sep 2021 [CLL]
- Teaching Autoregressive Language Models Complex Tasks By Demonstration
  Gabriel Recchia
  05 Sep 2021
- Text Summarization with Pretrained Encoders
  Yang Liu, Mirella Lapata
  22 Aug 2019 [MILM]
- Effective Approaches to Attention-based Neural Machine Translation
  Thang Luong, Hieu H. Pham, Christopher D. Manning
  17 Aug 2015