Dual-Space Knowledge Distillation for Large Language Models
Songming Zhang, Xue Zhang, Zengkui Sun, Yufeng Chen, Jinan Xu
arXiv 2406.17328 · 25 June 2024
Papers citing "Dual-Space Knowledge Distillation for Large Language Models" (6 of 6 papers shown)

| Title | Authors | Tags | Metrics | Date |
|---|---|---|---|---|
| Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom | Rishika Sen, Sujoy Roychowdhury, Sumit Soman, H. G. Ranjani, Srikhetra Mohanty | | 66 / 0 / 0 | 28 Apr 2025 |
| A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Xuzhi Zhang, Songming Zhang, Yunlong Liang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou | | 26 / 0 / 0 | 15 Apr 2025 |
| Cross-Tokenizer Distillation via Approximate Likelihood Matching | Benjamin Minixhofer, Ivan Vulić, E. Ponti | | 151 / 0 / 0 | 25 Mar 2025 |
| Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling | Haebin Shin, Lei Ji, Xiao Liu, Yeyun Gong | | 57 / 0 / 0 | 24 Mar 2025 |
| DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs | Jongwoo Ko, Tianyi Chen, Sungnyun Kim, Tianyu Ding, Luming Liang, Ilya Zharkov, Se-Young Yun | VLM | 168 / 0 / 0 | 10 Mar 2025 |
| Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei | | 261 / 4,489 / 0 | 23 Jan 2020 |