Self-Distillation Amplifies Regularization in Hilbert Space
arXiv:2002.05715, 13 February 2020
H. Mobahi, Mehrdad Farajtabar, Peter L. Bartlett
Papers citing "Self-Distillation Amplifies Regularization in Hilbert Space" (49 of 149 papers shown)
Combining Diverse Feature Priors. Saachi Jain, Dimitris Tsipras, A. Madry. 15 Oct 2021.
Instance-based Label Smoothing For Better Calibrated Classification Networks. Mohamed Maher, Meelis Kull. 11 Oct 2021. [UQCV]
Kernel Interpolation as a Bayes Point Machine. Jeremy Bernstein, Alexander R. Farhang, Yisong Yue. 08 Oct 2021. [BDL]
Spectral Bias in Practice: The Role of Function Frequency in Generalization. Sara Fridovich-Keil, Raphael Gontijo-Lopes, Rebecca Roelofs. 06 Oct 2021.
Deep Neural Compression Via Concurrent Pruning and Self-Distillation. J. Ó. Neill, Sourav Dutta, H. Assem. 30 Sep 2021. [VLM]
Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations. Fangyu Liu, Yunlong Jiao, Jordan Massiah, Emine Yilmaz, Serhii Havrylov. 27 Sep 2021. [SSL]
Self-Training with Differentiable Teacher. Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, T. Zhao, H. Zha. 15 Sep 2021.
Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP. Lifu Tu. 27 Aug 2021. [BDL]
Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach. Benjamin Ampel, Sagar Samtani, Steven Ullman, Hsinchun Chen. 03 Aug 2021.
Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition. Didan Deng, Liang Wu, Bertram E. Shi. 21 Jul 2021.
Confidence Conditioned Knowledge Distillation. Sourav Mishra, Suresh Sundaram. 06 Jul 2021.
Fair Visual Recognition in Limited Data Regime using Self-Supervision and Self-Distillation. Pratik Mazumder, Pravendra Singh, Vinay P. Namboodiri. 30 Jun 2021. [SSL]
R-Drop: Regularized Dropout for Neural Networks. Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Hao Fei, Tie-Yan Liu. 28 Jun 2021.
Midpoint Regularization: from High Uncertainty Training to Conservative Classification. Hongyu Guo. 26 Jun 2021.
Teacher's pet: understanding and mitigating biases in distillation. Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar. 19 Jun 2021.
What can linearized neural networks actually say about generalization? Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard. 12 Jun 2021.
Generate, Annotate, and Learn: NLP with Synthetic Text. Xuanli He, Islam Nassar, J. Kiros, Gholamreza Haffari, Mohammad Norouzi. 11 Jun 2021.
Does Knowledge Distillation Really Work? Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson. 10 Jun 2021. [FedML]
Churn Reduction via Distillation. Heinrich Jiang, Harikrishna Narasimhan, Dara Bahri, Andrew Cotter, Afshin Rostamizadeh. 04 Jun 2021.
Anchor-based Plain Net for Mobile Image Super-Resolution. Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu. 20 May 2021. [SupR, MQ]
Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation. Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun. 19 May 2021.
Knowledge Distillation as Semiparametric Inference. Tri Dao, G. Kamath, Vasilis Syrgkanis, Lester W. Mackey. 20 Apr 2021.
Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition. Liam Schoneveld, Alice Othmani, Hazem Abdelkawy. 16 Mar 2021.
On The Effect of Auxiliary Tasks on Representation Dynamics. Clare Lyle, Mark Rowland, Georg Ostrovski, Will Dabney. 25 Feb 2021.
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation. Kenneth Borup, L. Andersen. 25 Feb 2021.
Localization Distillation for Dense Object Detection. Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, W. Zuo, Qibin Hou, Ming-Ming Cheng. 24 Feb 2021. [ObjD]
Essentials for Class Incremental Learning. Sudhanshu Mittal, Silvio Galesso, Thomas Brox. 18 Feb 2021. [CLL]
Distilling Double Descent. Andrew Cotter, A. Menon, Harikrishna Narasimhan, A. S. Rawat, Sashank J. Reddi, Yichen Zhou. 13 Feb 2021.
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. Zeyuan Allen-Zhu, Yuanzhi Li. 17 Dec 2020. [FedML]
NeurIPS 2020 Competition: Predicting Generalization in Deep Learning. Yiding Jiang, Pierre Foret, Scott Yak, Daniel M. Roy, H. Mobahi, Gintare Karolina Dziugaite, Samy Bengio, Suriya Gunasekar, Isabelle M. Guyon, Behnam Neyshabur. 14 Dec 2020. [OOD]
Regularization via Adaptive Pairwise Label Smoothing. Hongyu Guo. 02 Dec 2020.
Run Away From your Teacher: Understanding BYOL by a Novel Self-Supervised Approach. Haizhou Shi, Dongliang Luo, Siliang Tang, Jian Wang, Yueting Zhuang. 22 Nov 2020. [SSL]
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning. Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine. 27 Oct 2020. [OffRL]
Iterative Graph Self-Distillation. Hanlin Zhang, Shuai Lin, Weiyang Liu, Pan Zhou, Jian Tang, Xiaodan Liang, Eric P. Xing. 23 Oct 2020. [SSL]
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher. Guangda Ji, Zhanxing Zhu. 20 Oct 2020.
Tatum-Level Drum Transcription Based on a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training. Ryoto Ishizuka, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. 08 Oct 2020.
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data. Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma. 07 Oct 2020. [SSL]
Improving QA Generalization by Concurrent Modeling of Multiple Biases. Mingzhu Wu, N. Moosavi, Andreas Rucklé, Iryna Gurevych. 07 Oct 2020. [AI4CE]
Noisy Self-Knowledge Distillation for Text Summarization. Yang Liu, S. Shen, Mirella Lapata. 15 Sep 2020.
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models. T. Wang, Selen Pehlivan, Jorma T. Laaksonen. 18 Aug 2020.
Density Fixing: Simple yet Effective Regularization Method based on the Class Prior. Masanari Kimura, Ryohei Izawa. 08 Jul 2020.
On the Demystification of Knowledge Distillation: A Residual Network Perspective. N. Jha, Rajat Saini, Sparsh Mittal. 30 Jun 2020.
Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning. Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson. 10 Jun 2020.
Knowledge Distillation: A Survey. Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao. 09 Jun 2020. [VLM]
Transferring Inductive Biases through Knowledge Distillation. Samira Abnar, Mostafa Dehghani, Willem H. Zuidema. 31 May 2020.
Why distillation helps: a statistical perspective. A. Menon, A. S. Rawat, Sashank J. Reddi, Seungyeon Kim, Sanjiv Kumar. 21 May 2020. [FedML]
Imitation Attacks and Defenses for Black-box Machine Translation Systems. Eric Wallace, Mitchell Stern, D. Song. 30 Apr 2020. [AAML]
Dropout as an Implicit Gating Mechanism For Continual Learning. Seyed Iman Mirzadeh, Mehrdad Farajtabar, H. Ghasemzadeh. 24 Apr 2020. [CLL]
Knowledge Distillation by On-the-Fly Native Ensemble. Xu Lan, Xiatian Zhu, S. Gong. 12 Jun 2018.