arXiv: 2210.12427
Cited By
Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model
22 October 2022
Dongkyu Lee, Zhiliang Tian, Ying Zhao, Ka Chun Cheung, N. Zhang
Papers citing "Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model" (8 of 8 shown):

1. Self-Distillation as Instance-Specific Label Smoothing. Zhilu Zhang, M. Sabuncu. 09 Jun 2020.
2. Calibrating Deep Neural Networks using Focal Loss. Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip Torr, P. Dokania. 21 Feb 2020.
3. Understanding and Improving Knowledge Distillation. Jiaxi Tang, Rakesh Shivanna, Zhe Zhao, Dong Lin, Anima Singh, Ed H. Chi, Sagar Jain. 10 Feb 2020.
4. TinyBERT: Distilling BERT for Natural Language Understanding. Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu. 23 Sep 2019.
5. When Does Label Smoothing Help? Rafael Müller, Simon Kornblith, Geoffrey E. Hinton. 06 Jun 2019.
6. Calibration of Encoder Decoder Models for Neural Machine Translation. Aviral Kumar, Sunita Sarawagi. 03 Mar 2019.
7. Neural Machine Translation of Rare Words with Subword Units. Rico Sennrich, Barry Haddow, Alexandra Birch. 31 Aug 2015.
8. FitNets: Hints for Thin Deep Nets. Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio. 19 Dec 2014.