Self-Distillation Amplifies Regularization in Hilbert Space

13 February 2020
Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett

Papers citing "Self-Distillation Amplifies Regularization in Hilbert Space"

Showing 50 of 149 citing papers:

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xingwei Long, Zhouhan Lin, Bowen Zhou
ALM, LRM
23 May 2023

SATA: Source Anchoring and Target Alignment Network for Continual Test Time Adaptation
Goirik Chakrabarty, Manogna Sreenivas, Soma Biswas
TTA
20 Apr 2023

Simulated Annealing in Early Layers Leads to Better Generalization
Amir M. Sarfi, Zahra Karimpour, Muawiz Chaudhary, N. Khalid, Mirco Ravanelli, Sudhir Mudur, Eugene Belilovsky
AI4CE, CLL
10 Apr 2023

Self-Distillation for Gaussian Process Regression and Classification
Kenneth Borup, L. Andersen
05 Apr 2023

Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang, Mengqi Xue, Jiangtao Zhang, Haofei Zhang, Yu Wang, Lechao Cheng, Mingli Song
26 Mar 2023

Self-distillation for surgical action recognition
Amine Yamlahi, T. Tran, Patrick Godau, Melanie Schellenberg, Dominik Michael, ..., T. Adler, M. Tizabi, C. Nwoye, N. Padoy, Lena Maier-Hein
22 Mar 2023

Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement
Fartash Faghri, Hadi Pouransari, Sachin Mehta, Mehrdad Farajtabar, Ali Farhadi, Mohammad Rastegari, Oncel Tuzel
15 Mar 2023

Improving Video Retrieval by Adaptive Margin
Feng He, Qi Wang, Zhifan Feng, Wenbin Jiang, Yajuan Lü, Yong Zhu, Xiao Tan
09 Mar 2023

Graph-based Knowledge Distillation: A survey and experimental evaluation
Jing Liu, Tongya Zheng, Guanzheng Zhang, Qinfen Hao
27 Feb 2023

Random Teachers are Good Teachers
Felix Sarnthein, Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann
23 Feb 2023

Distilling Calibrated Student from an Uncalibrated Teacher
Ishan Mishra, Sethu Vamsi Krishna, Deepak Mishra
FedML
22 Feb 2023

Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels
Zifu Wang, Xuefei Ning, Matthew B. Blaschko
VLM
11 Feb 2023

Understanding Self-Distillation in the Presence of Label Noise
Rudrajit Das, Sujay Sanghavi
30 Jan 2023

On student-teacher deviations in distillation: does it pay to disobey?
Vaishnavh Nagarajan, A. Menon, Srinadh Bhojanapalli, H. Mobahi, Sanjiv Kumar
30 Jan 2023

Knowledge Distillation ≈ Label Smoothing: Fact or Fallacy?
Md Arafat Sultan
30 Jan 2023

Supervision Complexity and its Role in Knowledge Distillation
Hrayr Harutyunyan, A. S. Rawat, A. Menon, Seungyeon Kim, Sanjiv Kumar
28 Jan 2023

A Survey on Reinforcement Learning Security with Application to Autonomous Driving
Ambra Demontis, Maura Pintor, Luca Demetrio, Kathrin Grosse, Hsiao-Ying Lin, Chengfang Fang, Battista Biggio, Fabio Roli
AAML
12 Dec 2022

Decentralized Learning with Multi-Headed Distillation
A. Zhmoginov, Mark Sandler, Nolan Miller, Gus Kristiansen, Max Vladymyrov
FedML
28 Nov 2022

SADT: Combining Sharpness-Aware Minimization with Self-Distillation for Improved Model Generalization
Masud An Nur Islam Fahim, Jani Boutellier
01 Nov 2022

Fast Yet Effective Speech Emotion Recognition with Self-distillation
Zhao Ren, Thanh Tam Nguyen, Yi Chang, Björn W. Schuller
26 Oct 2022

Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks
Chenxiao Yang, Qitian Wu, Junchi Yan
24 Oct 2022

Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation
Jieyi Bi, Yining Ma, Jiahai Wang, Zhiguang Cao, Jinbiao Chen, Yuan Sun, Yeow Meng Chee
14 Oct 2022

Using Knowledge Distillation to improve interpretable models in a retail banking context
Maxime Biehler, Mohamed Guermazi, Célim Starck
30 Sep 2022

Self-Distillation for Further Pre-training of Transformers
Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi
30 Sep 2022

Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels
Chenyu You, Weicheng Dai, Fenglin Liu, Yifei Min, Haoran Su, Xiaoran Zhang, Xiaoxiao Li, David A. Clifton, Lawrence H. Staib, James S. Duncan
27 Sep 2022

Towards Federated Learning against Noisy Labels via Local Self-Regularization
Xue Jiang, Sheng Sun, Yuwei Wang, Min Liu
25 Aug 2022

Bilateral Self-unbiased Learning from Biased Implicit Feedback
Jae-woong Lee, Seongmin Park, Joonseok Lee, Jongwuk Lee
CML
26 Jul 2022

An Empirical Study of Implicit Regularization in Deep Offline RL
Çağlar Gülçehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet
OffRL
05 Jul 2022

Revisiting Self-Distillation
M. Pham, Minsu Cho, Ameya Joshi, C. Hegde
17 Jun 2022

Toward Student-Oriented Teacher Network Training For Knowledge Distillation
Chengyu Dong, Liyuan Liu, Jingbo Shang
14 Jun 2022

Overcoming the Spectral Bias of Neural Value Approximation
Ge Yang, Anurag Ajay, Pulkit Agrawal
09 Jun 2022

Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training
Guodong Cao, Zhibo Wang, Xiaowei Dong, Zhifei Zhang, Hengchang Guo, Zhan Qin, Kui Ren
AAML
05 Jun 2022

Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Yanfeng Chang, Yun-Nung Chen
02 May 2022

Understanding and Preventing Capacity Loss in Reinforcement Learning
Clare Lyle, Mark Rowland, Will Dabney
CLL
20 Apr 2022

Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution
Zongcai Du, Ding Liu, Jie Liu, Jie Tang, Gangshan Wu, Lean Fu
SupR
18 Apr 2022

Localization Distillation for Object Detection
Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, Jun Wang, W. Zuo, Ming-Ming Cheng
12 Apr 2022

Bimodal Distributed Binarized Neural Networks
T. Rozen, Moshe Kimhi, Brian Chmiel, A. Mendelson, Chaim Baskin
MQ
05 Apr 2022

Knowledge Distillation: Bad Models Can Be Good Role Models
Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz
FedML
28 Mar 2022

Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?
J. Mok, Byunggook Na, Ji-Hoon Kim, Dongyoon Han, Sungroh Yoon
AAML
28 Mar 2022

Knowledge Distillation with the Reused Teacher Classifier
Defang Chen, Jianhan Mei, Hailin Zhang, C. Wang, Yan Feng, Chun-Yen Chen
26 Mar 2022

Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
Linfeng Zhang, Xin Chen, Xiaobing Tu, Pengfei Wan, N. Xu, Kaisheng Ma
12 Mar 2022

A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues
Qian Lin, Hwee Tou Ng
22 Feb 2022

Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen
AAML
16 Dec 2021

Learning Curves for Continual Learning in Neural Networks: Self-Knowledge Transfer and Forgetting
Ryo Karakida, S. Akaho
CLL
03 Dec 2021

Nonparametric Topological Layers in Neural Networks
Dongfang Zhao
27 Nov 2021

Self-Distilled Self-Supervised Representation Learning
J. Jang, Seonhoon Kim, Kiyoon Yoo, Chaerin Kong, Jang-Hyun Kim, Nojun Kwak
SSL
25 Nov 2021

Multi-label Iterated Learning for Image Classification with Label Ambiguity
Sai Rajeswar, Pau Rodríguez López, Soumye Singhal, David Vazquez, Rameswar Panda
VLM
23 Nov 2021

Domain-Agnostic Clustering with Self-Distillation
Mohammed Adnan, Yani Andrew Ioannou, Chuan-Yung Tsai, Graham W. Taylor
FedML, SSL, OOD
23 Nov 2021

Towards Model Agnostic Federated Learning Using Knowledge Distillation
A. Afonin, Sai Praneeth Karimireddy
FedML
28 Oct 2021

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, H. Mobahi, Yi Tay
16 Oct 2021