Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study

Papers citing "Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study"

48 / 48 papers shown
Knowledge Inheritance for Pre-trained Language Models
Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, ..., Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
28 May 2021
