Continual Distillation Learning: Knowledge Distillation in Prompt-based Continual Learning

We introduce the problem of continual distillation learning (CDL), which applies knowledge distillation (KD) to improve prompt-based continual learning (CL) models. The CDL problem is worth studying because larger vision transformers (ViTs) yield better performance in prompt-based continual learning, so distilling knowledge from a large ViT into a small ViT can improve the inference efficiency of prompt-based CL models. We empirically find that existing KD methods, such as logit distillation and feature distillation, cannot effectively improve the student model in the CDL setup. To address this, we introduce a novel method named Knowledge Distillation based on Prompts (KDP), in which globally accessible prompts specifically designed for knowledge distillation are inserted into the frozen ViT backbone of the student model. We demonstrate that our KDP method effectively enhances distillation performance in comparison to existing KD methods in the CDL setup.
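The abstract only sketches the mechanism, so the snippet below is a minimal PyTorch illustration of the general idea rather than the authors' implementation: learnable, globally shared prompts dedicated to distillation are prepended to the token sequence of a frozen student backbone, and only the prompts and classification head are trained against a soft-target KD loss. The class names, prompt count, temperature, loss weighting, and the use of a generic transformer encoder as a stand-in for the ViT backbone are all hypothetical assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptedStudent(nn.Module):
    """Frozen student encoder plus learnable, globally shared distillation prompts (sketch)."""

    def __init__(self, encoder: nn.Module, embed_dim: int, num_classes: int,
                 num_kd_prompts: int = 8):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():        # backbone stays frozen
            p.requires_grad_(False)
        # Globally accessible prompts dedicated to knowledge distillation.
        self.kd_prompts = nn.Parameter(0.02 * torch.randn(num_kd_prompts, embed_dim))
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N, D) embedded image patches from the frozen patch embedding.
        b = patch_tokens.size(0)
        prompts = self.kd_prompts.unsqueeze(0).expand(b, -1, -1)   # (B, P, D)
        tokens = torch.cat([prompts, patch_tokens], dim=1)         # prepend KD prompts
        feats = self.encoder(tokens)                               # (B, P + N, D)
        return self.head(feats.mean(dim=1))                        # pooled logits


def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Soft-target KD loss mixed with cross-entropy (hypothetical weighting)."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * soft + (1 - alpha) * F.cross_entropy(student_logits, labels)


# Toy usage: only the KD prompts and the head receive gradients.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=192, nhead=3, batch_first=True), num_layers=4)
student = PromptedStudent(encoder, embed_dim=192, num_classes=100)
optim = torch.optim.Adam([p for p in student.parameters() if p.requires_grad], lr=1e-3)

patches = torch.randn(4, 196, 192)       # stand-in for embedded image patches
teacher_logits = torch.randn(4, 100)     # stand-in for outputs of a larger frozen teacher ViT
labels = torch.randint(0, 100, (4,))
loss = kd_loss(student(patches), teacher_logits, labels)
loss.backward()
optim.step()
```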
@article{zhang2025_2407.13911,
  title={Continual Distillation Learning: Knowledge Distillation in Prompt-based Continual Learning},
  author={Qifan Zhang and Yunhui Guo and Yu Xiang},
  journal={arXiv preprint arXiv:2407.13911},
  year={2025}
}