arXiv:1910.01255
Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network
2 October 2019
Bin Dong, Jikai Hou, Yiping Lu, Zhihua Zhang
Papers citing "Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network"
13 / 13 papers shown
Retraining with Predicted Hard Labels Provably Increases Model Accuracy
Rudrajit Das, Inderjit S Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong
17 Jun 2024

Mixed-Type Wafer Classification For Low Memory Devices Using Knowledge Distillation
Nitish Shukla, Anurima Dey, K. Srivatsan
24 Mar 2023

Respecting Transfer Gap in Knowledge Distillation
Yulei Niu, Long Chen, Chan Zhou, Hanwang Zhang
23 Oct 2022

Overview frequency principle/spectral bias in deep learning
Z. Xu, Yaoyu Zhang, Tao Luo [FaML]
19 Jan 2022

Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime
Rui Zhang, Shihua Zhang [TDI]
15 Dec 2021

Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher
Mehdi Rezagholizadeh, A. Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, A. Ghodsi
16 Oct 2021

Knowledge Distillation with Noisy Labels for Natural Language Understanding
Shivendra Bhardwaj, Abbas Ghaddar, Ahmad Rashid, Khalil Bibi, Cheng-huan Li, A. Ghodsi, Philippe Langlais, Mehdi Rezagholizadeh
21 Sep 2021

Self-paced Resistance Learning against Overfitting on Noisy Labels
Xiaoshuang Shi, Zhenhua Guo, Fuyong Xing, Yun Liang, Xiaofeng Zhu [NoLa]
07 May 2021

Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise
Pengfei Chen, Junjie Ye, Guangyong Chen, Jingwei Zhao, Pheng-Ann Heng [NoLa]
10 Dec 2020

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020

When Does Preconditioning Help or Hurt Generalization?
S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu
18 Jun 2020

Self-Distillation Amplifies Regularization in Hilbert Space
H. Mobahi, Mehrdad Farajtabar, Peter L. Bartlett
13 Feb 2020

SaaS: Speed as a Supervisor for Semi-supervised Learning
Safa Cicek, Alhussein Fawzi, Stefano Soatto [BDL]
02 May 2018