Self-Distillation Amplifies Regularization in Hilbert Space

13 February 2020
Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett

Papers citing "Self-Distillation Amplifies Regularization in Hilbert Space"

Showing 49 of 149 citing papers:
  • Combining Diverse Feature Priors. Saachi Jain, Dimitris Tsipras, A. Madry. 15 Oct 2021.
  • Instance-based Label Smoothing For Better Calibrated Classification Networks. Mohamed Maher, Meelis Kull. 11 Oct 2021. [UQCV]
  • Kernel Interpolation as a Bayes Point Machine. Jeremy Bernstein, Alexander R. Farhang, Yisong Yue. 08 Oct 2021. [BDL]
  • Spectral Bias in Practice: The Role of Function Frequency in Generalization. Sara Fridovich-Keil, Raphael Gontijo-Lopes, Rebecca Roelofs. 06 Oct 2021.
  • Deep Neural Compression Via Concurrent Pruning and Self-Distillation. J. Ó. Neill, Sourav Dutta, H. Assem. 30 Sep 2021. [VLM]
  • Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations. Fangyu Liu, Yunlong Jiao, Jordan Massiah, Emine Yilmaz, Serhii Havrylov. 27 Sep 2021. [SSL]
  • Self-Training with Differentiable Teacher. Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, T. Zhao, H. Zha. 15 Sep 2021.
  • Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP. Lifu Tu. 27 Aug 2021. [BDL]
  • Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach. Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen. 03 Aug 2021.
  • Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition. Didan Deng, Liang Wu, Bertram E. Shi. 21 Jul 2021.
  • Confidence Conditioned Knowledge Distillation. Sourav Mishra, Suresh Sundaram. 06 Jul 2021.
  • Fair Visual Recognition in Limited Data Regime using Self-Supervision and Self-Distillation. Pratik Mazumder, Pravendra Singh, Vinay P. Namboodiri. 30 Jun 2021. [SSL]
  • R-Drop: Regularized Dropout for Neural Networks. Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Hao Fei, Tie-Yan Liu. 28 Jun 2021.
  • Midpoint Regularization: from High Uncertainty Training to Conservative Classification. Hongyu Guo. 26 Jun 2021.
  • Teacher's pet: understanding and mitigating biases in distillation. Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar. 19 Jun 2021.
  • What can linearized neural networks actually say about generalization? Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard. 12 Jun 2021.
  • Generate, Annotate, and Learn: NLP with Synthetic Text. Xuanli He, Islam Nassar, J. Kiros, Gholamreza Haffari, Mohammad Norouzi. 11 Jun 2021.
  • Does Knowledge Distillation Really Work? Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson. 10 Jun 2021. [FedML]
  • Churn Reduction via Distillation. Heinrich Jiang, Harikrishna Narasimhan, Dara Bahri, Andrew Cotter, Afshin Rostamizadeh. 04 Jun 2021.
  • Anchor-based Plain Net for Mobile Image Super-Resolution. Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu. 20 May 2021. [SupR, MQ]
  • Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation. Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun. 19 May 2021.
  • Knowledge Distillation as Semiparametric Inference. Tri Dao, G. Kamath, Vasilis Syrgkanis, Lester W. Mackey. 20 Apr 2021.
  • Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition. Liam Schoneveld, Alice Othmani, Hazem Abdelkawy. 16 Mar 2021.
  • On The Effect of Auxiliary Tasks on Representation Dynamics. Clare Lyle, Mark Rowland, Georg Ostrovski, Will Dabney. 25 Feb 2021.
  • Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation. Kenneth Borup, L. Andersen. 25 Feb 2021.
  • Localization Distillation for Dense Object Detection. Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, W. Zuo, Qibin Hou, Ming-Ming Cheng. 24 Feb 2021. [ObjD]
  • Essentials for Class Incremental Learning. Sudhanshu Mittal, Silvio Galesso, Thomas Brox. 18 Feb 2021. [CLL]
  • Distilling Double Descent. Andrew Cotter, A. Menon, Harikrishna Narasimhan, A. S. Rawat, Sashank J. Reddi, Yichen Zhou. 13 Feb 2021.
  • Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. Zeyuan Allen-Zhu, Yuanzhi Li. 17 Dec 2020. [FedML]
  • NeurIPS 2020 Competition: Predicting Generalization in Deep Learning. Yiding Jiang, Pierre Foret, Scott Yak, Daniel M. Roy, H. Mobahi, Gintare Karolina Dziugaite, Samy Bengio, Suriya Gunasekar, Isabelle M. Guyon, Behnam Neyshabur. 14 Dec 2020. [OOD]
  • Regularization via Adaptive Pairwise Label Smoothing. Hongyu Guo. 02 Dec 2020.
  • Run Away From your Teacher: Understanding BYOL by a Novel Self-Supervised Approach. Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang. 22 Nov 2020. [SSL]
  • Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning. Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine. 27 Oct 2020. [OffRL]
  • Iterative Graph Self-Distillation. Hanlin Zhang, Shuai Lin, Weiyang Liu, Pan Zhou, Jian Tang, Xiaodan Liang, Eric P. Xing. 23 Oct 2020. [SSL]
  • Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher. Guangda Ji, Zhanxing Zhu. 20 Oct 2020.
  • Tatum-Level Drum Transcription Based on a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training. Ryoto Ishizuka, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. 08 Oct 2020.
  • Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data. Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma. 07 Oct 2020. [SSL]
  • Improving QA Generalization by Concurrent Modeling of Multiple Biases. Mingzhu Wu, N. Moosavi, Andreas Rucklé, Iryna Gurevych. 07 Oct 2020. [AI4CE]
  • Noisy Self-Knowledge Distillation for Text Summarization. Yang Liu, S. Shen, Mirella Lapata. 15 Sep 2020.
  • Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models. T. Wang, Selen Pehlivan, Jorma T. Laaksonen. 18 Aug 2020.
  • Density Fixing: Simple yet Effective Regularization Method based on the Class Prior. Masanari Kimura, Ryohei Izawa. 08 Jul 2020.
  • On the Demystification of Knowledge Distillation: A Residual Network Perspective. N. Jha, Rajat Saini, Sparsh Mittal. 30 Jun 2020.
  • Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning. Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson. 10 Jun 2020.
  • Knowledge Distillation: A Survey. Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao. 09 Jun 2020. [VLM]
  • Transferring Inductive Biases through Knowledge Distillation. Samira Abnar, Mostafa Dehghani, Willem H. Zuidema. 31 May 2020.
  • Why distillation helps: a statistical perspective. A. Menon, A. S. Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar. 21 May 2020. [FedML]
  • Imitation Attacks and Defenses for Black-box Machine Translation Systems. Eric Wallace, Mitchell Stern, D. Song. 30 Apr 2020. [AAML]
  • Dropout as an Implicit Gating Mechanism For Continual Learning. Seyed Iman Mirzadeh, Mehrdad Farajtabar, H. Ghasemzadeh. 24 Apr 2020. [CLL]
  • Knowledge Distillation by On-the-Fly Native Ensemble. Xu Lan, Xiatian Zhu, S. Gong. 12 Jun 2018.