Self-Distillation as Instance-Specific Label Smoothing
arXiv:2006.05065 · 9 June 2020
Zhilu Zhang, M. Sabuncu
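For readers skimming this cited-by list: the cited paper's core observation is that self-distillation, i.e., retraining a network on the temperature-softened predictions of an earlier generation of itself, behaves like label smoothing whose smoothing distribution is specific to each training instance rather than uniform. The snippet below is a minimal PyTorch sketch of a generic self-distillation objective of this kind; the function name, the alpha and temperature values, and the loss weighting are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, targets,
                           alpha=0.5, temperature=4.0):
    """Generic self-distillation objective (illustrative sketch, not the paper's exact loss).

    Combines cross-entropy on the hard labels with a KL term pulling the
    student toward the softened predictions of an earlier generation of the
    same model (the "teacher" in self-distillation).
    """
    # Standard cross-entropy against the ground-truth (hard) labels.
    ce = F.cross_entropy(student_logits, targets)

    # KL divergence between the student's and the teacher's temperature-softened
    # distributions; the T^2 factor is the usual gradient-scale correction.
    t = temperature
    soft_teacher = F.softmax(teacher_logits.detach() / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)

    # Equivalent reading: the effective target is the instance-specific mixture
    # (1 - alpha) * one_hot(y) + alpha * teacher_soft, i.e. label smoothing whose
    # smoothing distribution depends on the example.
    return (1.0 - alpha) * ce + alpha * kd

# Toy usage with random logits for a 10-class problem.
if __name__ == "__main__":
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    targets = torch.randint(0, 10, (8,))
    loss = self_distillation_loss(student_logits, teacher_logits, targets)
    loss.backward()
    print(float(loss))
```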

Papers citing "Self-Distillation as Instance-Specific Label Smoothing"

17 of 17 papers shown.

  1. From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs. Jiliang Ni, Jiachen Pu, Zhongyi Yang, Kun Zhou, Hui Wang, Xiaoliang Xiao, Dakui Wang, Xin Li, Jingfeng Luo, Conggang Hu. 18 Apr 2025.
  2. sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging. Jingyuan Chen, Yuan Yao, Mie Anderson, Natalie Hauglund, Celia Kjaerby, Verena Untiet, Maiken Nedergaard, Jiebo Luo. 28 Jan 2025.
  3. The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model. Kaito Takanami, Takashi Takahashi, Ayaka Sakata. 27 Jan 2025.
  4. Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection. Jash Dalvi, Ali Dabouei, Gunjan Dhanuka, Min Xu. 05 Jun 2024.
  5. Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning. Runqian Wang, Soumya Ghosh, David D. Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky. 27 May 2024.
  6. An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation. Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil. 12 Jan 2024.
  7. Revisiting Token Dropping Strategy in Efficient BERT Pretraining. Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, Dacheng Tao. 24 May 2023. [VLM]
  8. Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation. Tianli Zhang, Mengqi Xue, Jiangtao Zhang, Haofei Zhang, Yu Wang, Lechao Cheng, Jie Song, Mingli Song. 26 Mar 2023.
  9. Boosting Graph Neural Networks via Adaptive Knowledge Distillation. Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh V. Chawla. 12 Oct 2022.
  10. Spot-adaptive Knowledge Distillation. Jie Song, Ying Chen, Jingwen Ye, Mingli Song. 05 May 2022.
  11. Better Supervisory Signals by Observing Learning Paths. Yi Ren, Shangmin Guo, Danica J. Sutherland. 04 Mar 2022.
  12. Dynamic Rectification Knowledge Distillation. Fahad Rahman Amik, Ahnaf Ismat Tasin, Silvia Ahmed, M. M. L. Elahi, Nabeel Mohammed. 27 Jan 2022.
  13. Learning with Label Noise for Image Retrieval by Selecting Interactions. Sarah Ibrahimi, Arnaud Sors, Rafael Sampaio de Rezende, S. Clinchant. 20 Dec 2021. [NoLa, VLM]
  14. Federated Few-Shot Learning with Adversarial Learning. Chenyou Fan, Jianwei Huang. 01 Apr 2021. [FedML]
  15. Knowledge Distillation: A Survey. Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao. 09 Jun 2020. [VLM]
  16. Knowledge Distillation by On-the-Fly Native Ensemble. Xu Lan, Xiatian Zhu, S. Gong. 12 Jun 2018.
  17. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. Antti Tarvainen, Harri Valpola. 06 Mar 2017. [OOD, MoMe]