Self-Distillation Amplifies Regularization in Hilbert Space
arXiv:2002.05715, 13 February 2020
H. Mobahi, Mehrdad Farajtabar, Peter L. Bartlett
Papers citing "Self-Distillation Amplifies Regularization in Hilbert Space" (49 of 149 papers shown)
Combining Diverse Feature Priors. Saachi Jain, Dimitris Tsipras, A. Madry. 15 Oct 2021.
Instance-based Label Smoothing For Better Calibrated Classification Networks. Mohamed Maher, Meelis Kull. 11 Oct 2021. [UQCV]
Kernel Interpolation as a Bayes Point Machine. Jeremy Bernstein, Alexander R. Farhang, Yisong Yue. 08 Oct 2021. [BDL]
Spectral Bias in Practice: The Role of Function Frequency in Generalization. Sara Fridovich-Keil, Raphael Gontijo-Lopes, Rebecca Roelofs. 06 Oct 2021.
Deep Neural Compression Via Concurrent Pruning and Self-Distillation. J. Ó. Neill, Sourav Dutta, H. Assem. 30 Sep 2021. [VLM]
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations. Fangyu Liu, Yunlong Jiao, Jordan Massiah, Emine Yilmaz, Serhii Havrylov. 27 Sep 2021. [SSL]
Self-Training with Differentiable Teacher. Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, T. Zhao, H. Zha. 15 Sep 2021.
Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP. Lifu Tu. 27 Aug 2021. [BDL]
Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach. Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen. 03 Aug 2021.
Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition. Didan Deng, Liang Wu, Bertram E. Shi. 21 Jul 2021.
Confidence Conditioned Knowledge Distillation. Sourav Mishra, Suresh Sundaram. 06 Jul 2021.
Fair Visual Recognition in Limited Data Regime using Self-Supervision and Self-Distillation. Pratik Mazumder, Pravendra Singh, Vinay P. Namboodiri. 30 Jun 2021. [SSL]
R-Drop: Regularized Dropout for Neural Networks. Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Hao Fei, Tie-Yan Liu. 28 Jun 2021.
Midpoint Regularization: from High Uncertainty Training to Conservative Classification. Hongyu Guo. 26 Jun 2021.
Teacher's pet: understanding and mitigating biases in distillation. Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar. 19 Jun 2021.
What can linearized neural networks actually say about generalization? Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard. 12 Jun 2021.
Generate, Annotate, and Learn: NLP with Synthetic Text. Xuanli He, Islam Nassar, J. Kiros, Gholamreza Haffari, Mohammad Norouzi. 11 Jun 2021.
Does Knowledge Distillation Really Work? Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson. 10 Jun 2021. [FedML]
Churn Reduction via Distillation. Heinrich Jiang, Harikrishna Narasimhan, Dara Bahri, Andrew Cotter, Afshin Rostamizadeh. 04 Jun 2021.
Anchor-based Plain Net for Mobile Image Super-Resolution. Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu. 20 May 2021. [SupR, MQ]
Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation. Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun. 19 May 2021.
Knowledge Distillation as Semiparametric Inference. Tri Dao, G. Kamath, Vasilis Syrgkanis, Lester W. Mackey. 20 Apr 2021.
Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition. Liam Schoneveld, Alice Othmani, Hazem Abdelkawy. 16 Mar 2021.
On The Effect of Auxiliary Tasks on Representation Dynamics. Clare Lyle, Mark Rowland, Georg Ostrovski, Will Dabney. 25 Feb 2021.
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation. Kenneth Borup, L. Andersen. 25 Feb 2021.
Localization Distillation for Dense Object Detection. Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, W. Zuo, Qibin Hou, Ming-Ming Cheng. 24 Feb 2021. [ObjD]
Essentials for Class Incremental Learning. Sudhanshu Mittal, Silvio Galesso, Thomas Brox. 18 Feb 2021. [CLL]
Distilling Double Descent. Andrew Cotter, A. Menon, Harikrishna Narasimhan, A. S. Rawat, Sashank J. Reddi, Yichen Zhou. 13 Feb 2021.
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. Zeyuan Allen-Zhu, Yuanzhi Li. 17 Dec 2020. [FedML]
NeurIPS 2020 Competition: Predicting Generalization in Deep Learning. Yiding Jiang, Pierre Foret, Scott Yak, Daniel M. Roy, H. Mobahi, Gintare Karolina Dziugaite, Samy Bengio, Suriya Gunasekar, Isabelle M. Guyon, Behnam Neyshabur. 14 Dec 2020. [OOD]
Regularization via Adaptive Pairwise Label Smoothing. Hongyu Guo. 02 Dec 2020.
Run Away From your Teacher: Understanding BYOL by a Novel Self-Supervised Approach. Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang. 22 Nov 2020. [SSL]
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning. Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine. 27 Oct 2020. [OffRL]
Iterative Graph Self-Distillation. Hanlin Zhang, Shuai Lin, Weiyang Liu, Pan Zhou, Jian Tang, Xiaodan Liang, Eric P. Xing. 23 Oct 2020. [SSL]
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher. Guangda Ji, Zhanxing Zhu. 20 Oct 2020.
Tatum-Level Drum Transcription Based on a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training. Ryoto Ishizuka, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. 08 Oct 2020.
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data. Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma. 07 Oct 2020. [SSL]
Improving QA Generalization by Concurrent Modeling of Multiple Biases. Mingzhu Wu, N. Moosavi, Andreas Rucklé, Iryna Gurevych. 07 Oct 2020. [AI4CE]
Noisy Self-Knowledge Distillation for Text Summarization. Yang Liu, S. Shen, Mirella Lapata. 15 Sep 2020.
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models. T. Wang, Selen Pehlivan, Jorma T. Laaksonen. 18 Aug 2020.
Density Fixing: Simple yet Effective Regularization Method based on the Class Prior. Masanari Kimura, Ryohei Izawa. 08 Jul 2020.
On the Demystification of Knowledge Distillation: A Residual Network Perspective. N. Jha, Rajat Saini, Sparsh Mittal. 30 Jun 2020.
Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning. Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson. 10 Jun 2020.
Knowledge Distillation: A Survey. Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao. 09 Jun 2020. [VLM]
Transferring Inductive Biases through Knowledge Distillation. Samira Abnar, Mostafa Dehghani, Willem H. Zuidema. 31 May 2020.
Why distillation helps: a statistical perspective. A. Menon, A. S. Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar. 21 May 2020. [FedML]
Imitation Attacks and Defenses for Black-box Machine Translation Systems. Eric Wallace, Mitchell Stern, D. Song. 30 Apr 2020. [AAML]
Dropout as an Implicit Gating Mechanism For Continual Learning. Seyed Iman Mirzadeh, Mehrdad Farajtabar, H. Ghasemzadeh. 24 Apr 2020. [CLL]
Knowledge Distillation by On-the-Fly Native Ensemble. Xu Lan, Xiatian Zhu, S. Gong. 12 Jun 2018.