Self-Distillation as Instance-Specific Label Smoothing
Zhilu Zhang, M. Sabuncu
arXiv:2006.05065 · 9 June 2020

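For context on the title: the paper's central idea is that training a network on the softened predictions of an earlier copy of itself amounts to label smoothing in which the smoothing distribution varies per training example. Below is a minimal sketch of that view in PyTorch; the function name, alpha, and temperature defaults are illustrative assumptions, not values or code from the paper.

import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels,
                           alpha=0.5, temperature=2.0):
    """Cross-entropy against a per-example smoothed target.

    The target mixes the one-hot label with the frozen teacher's
    softened prediction, so the smoothing distribution differs for
    every instance: the "instance-specific label smoothing" view.
    alpha and temperature are illustrative, not from the paper.
    """
    num_classes = student_logits.size(-1)
    one_hot = F.one_hot(labels, num_classes).float()
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    target = (1.0 - alpha) * one_hot + alpha * teacher_probs
    log_probs = F.log_softmax(student_logits, dim=-1)
    return -(target * log_probs).sum(dim=-1).mean()

# In self-distillation the "teacher" is a frozen copy of the same
# architecture from a previous training round. Plain label smoothing
# is the special case where teacher_probs is the uniform distribution.
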
Papers citing "Self-Distillation as Instance-Specific Label Smoothing" (17 of 17 shown):

From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs
Jiliang Ni, Jiachen Pu, Zhongyi Yang, Kun Zhou, Hui Wang, Xiaoliang Xiao, Dakui Wang, Xin Li, Jingfeng Luo, Conggang Hu (18 Apr 2025)

sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging
Jingyuan Chen, Yuan Yao, Mie Anderson, Natalie Hauglund, Celia Kjaerby, Verena Untiet, Maiken Nedergaard, Jiebo Luo (28 Jan 2025)

The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
Kaito Takanami, Takashi Takahashi, Ayaka Sakata (27 Jan 2025)

Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection
Jash Dalvi, Ali Dabouei, Gunjan Dhanuka, Min Xu (05 Jun 2024)

Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Runqian Wang, Soumya Ghosh, David D. Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky (27 May 2024)

An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation
Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil (12 Jan 2024)

Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, Dacheng Tao (24 May 2023) [VLM]

Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang, Mengqi Xue, Jiangtao Zhang, Haofei Zhang, Yu Wang, Lechao Cheng, Jie Song, Mingli Song (26 Mar 2023)

Boosting Graph Neural Networks via Adaptive Knowledge Distillation
Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh V. Chawla (12 Oct 2022)

Spot-adaptive Knowledge Distillation
Jie Song, Ying Chen, Jingwen Ye, Mingli Song (05 May 2022)

Better Supervisory Signals by Observing Learning Paths
Yi Ren, Shangmin Guo, Danica J. Sutherland (04 Mar 2022)

Dynamic Rectification Knowledge Distillation
Fahad Rahman Amik, Ahnaf Ismat Tasin, Silvia Ahmed, M. M. L. Elahi, Nabeel Mohammed (27 Jan 2022)

Learning with Label Noise for Image Retrieval by Selecting Interactions
Sarah Ibrahimi, Arnaud Sors, Rafael Sampaio de Rezende, S. Clinchant (20 Dec 2021) [NoLa, VLM]

Federated Few-Shot Learning with Adversarial Learning
Chenyou Fan, Jianwei Huang (01 Apr 2021) [FedML]

Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao (09 Jun 2020) [VLM]

Knowledge Distillation by On-the-Fly Native Ensemble
Xu Lan, Xiatian Zhu, S. Gong (12 Jun 2018)

Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
Antti Tarvainen, Harri Valpola (06 Mar 2017) [OOD, MoMe]