Deep Model Compression: Distilling Knowledge from Noisy Teachers

30 October 2016
Bharat Bhusan Sau
V. Balasubramanian
arXiv:1610.09650
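For context, a minimal sketch of the idea the paper's title refers to: training a compact student model by regressing its outputs onto a teacher whose logits are randomly perturbed with noise, which acts as a regularizer. This is an illustrative assumption-laden sketch; the loss form, noise probability, and noise scale below are not the paper's reported settings.

```python
import torch
import torch.nn as nn

def noisy_teacher_distillation_loss(student_logits, teacher_logits,
                                    noise_prob=0.5, noise_std=0.1):
    """Perturb the teacher's logits with Gaussian noise for a random
    subset of samples, then regress the student's logits onto them.

    Illustrative sketch of logit-based distillation with a noisy teacher;
    noise_prob and noise_std are assumed values, not reported ones.
    """
    batch_size = teacher_logits.size(0)
    # Per-sample decision on whether to inject noise into the teacher output.
    mask = (torch.rand(batch_size, device=teacher_logits.device) < noise_prob).float()
    noise = torch.randn_like(teacher_logits) * noise_std
    noisy_teacher = teacher_logits + mask.unsqueeze(1) * noise
    # L2 regression of student logits onto the (noisy) teacher logits.
    return nn.functional.mse_loss(student_logits, noisy_teacher.detach())

if __name__ == "__main__":
    # Random tensors stand in for student and teacher model outputs.
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    loss = noisy_teacher_distillation_loss(student_logits, teacher_logits)
    loss.backward()
    print(float(loss))
```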

Papers citing "Deep Model Compression: Distilling Knowledge from Noisy Teachers"

50 / 75 papers shown
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
240
0
0
20 Mar 2025
Distillation of Diffusion Features for Semantic Correspondence
Frank Fundel
Johannes Schusterbauer
Vincent Tao Hu
Bjorn Ommer
DiffM
93
3
0
04 Dec 2024
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-xiong Wang
57
1
0
18 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
52
1
0
02 Oct 2024
AdaKD: Dynamic Knowledge Distillation of ASR models using Adaptive Loss Weighting
Shreyan Ganguly
Roshan Nayak
Rakshith Rao
Ujan Deb
AP Prathosh
32
1
0
11 May 2024
Two-Stage Multi-task Self-Supervised Learning for Medical Image Segmentation
Binyan Hu
A. K. Qin
SSL
19
0
0
11 Feb 2024
Deeper or Wider: A Perspective from Optimal Generalization Error with Sobolev Loss
Yahong Yang
Juncai He
AI4CE
40
7
0
31 Jan 2024
Optimal Deep Neural Network Approximation for Korobov Functions with respect to Sobolev Norms
Yahong Yang
Yulong Lu
40
3
0
08 Nov 2023
Knowledge Distillation for Anomaly Detection
Adrian Alan Pol
E. Govorkova
Sonja Grönroos
N. Chernyavskaya
Philip C. Harris
M. Pierini
I. Ojalvo
P. Elmer
27
1
0
09 Oct 2023
The Quest of Finding the Antidote to Sparse Double Descent
Victor Quétu
Marta Milovanović
34
0
0
31 Aug 2023
Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation
Shengcao Cao
Mengtian Li
James Hays
Deva Ramanan
Yu-xiong Wang
Liangyan Gui
VLM
26
11
0
17 Aug 2023
Nearly Optimal VC-Dimension and Pseudo-Dimension Bounds for Deep Neural Network Derivatives
Yahong Yang
Haizhao Yang
Yang Xiang
31
19
0
15 May 2023
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation
Fengyi Shen
A. Gurram
Ziyuan Liu
He Wang
Alois Knoll
24
26
0
05 Apr 2023
Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition
Xiaoyu Yang
Qiujia Li
C. Zhang
P. Woodland
36
6
0
20 Mar 2023
Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss
Mohammad Zeineldeen
Kartik Audhkhasi
M. Baskar
Bhuvana Ramabhadran
24
2
0
10 Mar 2023
DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
32
7
0
02 Mar 2023
Tailor: Altering Skip Connections for Resource-Efficient Inference
Olivia Weng
Gabriel Marcano
Vladimir Loncar
Alireza Khodamoradi
Nojan Sheybani
Andres Meza
F. Koushanfar
K. Denolf
Javier Mauricio Duarte
Ryan Kastner
46
12
0
18 Jan 2023
A General Multiple Data Augmentation Based Framework for Training Deep Neural Networks
Bin Hu
Yu Sun
A. K. Qin
AI4CE
36
0
0
29 May 2022
Generalized Knowledge Distillation via Relationship Matching
Han-Jia Ye
Su Lu
De-Chuan Zhan
FedML
22
20
0
04 May 2022
A Simple Structure For Building A Robust Model
Xiao Tan
Jingbo Gao
Ruolin Li
AAML
OOD
46
3
0
25 Apr 2022
Selective Cross-Task Distillation
Su Lu
Han-Jia Ye
De-Chuan Zhan
36
0
0
25 Apr 2022
Deadwooding: Robust Global Pruning for Deep Neural Networks
Sawinder Kaur
Ferdinando Fioretto
Asif Salekin
27
4
0
10 Feb 2022
Dynamic Rectification Knowledge Distillation
Fahad Rahman Amik
Ahnaf Ismat Tasin
Silvia Ahmed
M. M. L. Elahi
Nabeel Mohammed
31
5
0
27 Jan 2022
Contrastive Neighborhood Alignment
Pengkai Zhu
Zhaowei Cai
Yuanjun Xiong
Zhuowen Tu
Luis Goncalves
Vijay Mahadevan
Stefano Soatto
18
2
0
06 Jan 2022
Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix
Peng Liu
19
0
0
21 Dec 2021
Estimating and Maximizing Mutual Information for Knowledge Distillation
A. Shrivastava
Yanjun Qi
Vicente Ordonez
21
5
0
29 Oct 2021
A Studious Approach to Semi-Supervised Learning
Sahil Khose
Shruti Jain
V. Manushree
18
0
0
18 Sep 2021
LANA: Latency Aware Network Acceleration
Pavlo Molchanov
Jimmy Hall
Hongxu Yin
Jan Kautz
Nicolò Fusi
Arash Vahdat
25
11
0
12 Jul 2021
Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation
Yufei Wang
Haoliang Li
Lap-pui Chau
Alex C. Kot
FedML
21
40
0
06 Jul 2021
Few-Shot Learning with a Strong Teacher
Han-Jia Ye
Lu Ming
De-Chuan Zhan
Wei-Lun Chao
19
50
0
01 Jul 2021
Distilling the Knowledge from Conditional Normalizing Flows
Dmitry Baranchuk
Vladimir Aliev
Artem Babenko
BDL
36
2
0
24 Jun 2021
Adaptive Multi-Teacher Multi-level Knowledge Distillation
Yuang Liu
Wei Zhang
Jun Wang
28
157
0
06 Mar 2021
Distilling Knowledge via Intermediate Classifiers
Aryan Asadian
Amirali Salehi-Abari
38
1
0
28 Feb 2021
ADD: Augmented Disentanglement Distillation Framework for Improving Stock Trend Forecasting
H. Tang
Lijun Wu
Weiqing Liu
Jiang Bian
AIFin
11
4
0
11 Dec 2020
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization
Het Shah
Avishree Khare
Neelay Shah
Khizir Siddiqui
MQ
13
6
0
30 Nov 2020
Learnable Boundary Guided Adversarial Training
Jiequan Cui
Shu Liu
Liwei Wang
Jiaya Jia
OOD
AAML
30
124
0
23 Nov 2020
Computing Systems for Autonomous Driving: State-of-the-Art and Challenges
Liangkai Liu
Sidi Lu
Ren Zhong
Baofu Wu
Yongtao Yao
Qingyan Zhang
Weisong Shi
27
268
0
30 Sep 2020
Compression of Deep Learning Models for Text: A Survey
Manish Gupta
Puneet Agrawal
VLM
MedIm
AI4CE
22
115
0
12 Aug 2020
ESPN: Extremely Sparse Pruned Networks
Minsu Cho
Ameya Joshi
C. Hegde
19
7
0
28 Jun 2020
Knowledge Distillation: A Survey
Jianping Gou
B. Yu
Stephen J. Maybank
Dacheng Tao
VLM
23
2,851
0
09 Jun 2020
ResKD: Residual-Guided Knowledge Distillation
Xuewei Li
Songyuan Li
Bourahla Omar
Fei Wu
Xi Li
21
47
0
08 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
45
98
0
05 Jun 2020
Sub-Band Knowledge Distillation Framework for Speech Enhancement
Xiang Hao
Shi-Xue Wen
Xiangdong Su
Yun Liu
Guanglai Gao
Xiaofei Li
27
18
0
29 May 2020
CHEER: Rich Model Helps Poor Model via Knowledge Infusion
Cao Xiao
T. Hoang
Linda Qiao
Tengfei Ma
Jimeng Sun
16
3
0
21 May 2020
Edge Intelligence: Architectures, Challenges, and Applications
Dianlei Xu
Tong Li
Yong Li
Xiang Su
Sasu Tarkoma
Tao Jiang
Jon Crowcroft
Pan Hui
50
29
0
26 Mar 2020
Efficient Crowd Counting via Structured Knowledge Transfer
Lingbo Liu
Jiaqi Chen
Hefeng Wu
Tianshui Chen
Guanbin Li
Liang Lin
29
64
0
23 Mar 2020
MarginDistillation: distillation for margin-based softmax
D. Svitov
S. Alyamkin
CVBM
25
9
0
05 Mar 2020
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
Saurabh Goyal
Anamitra R. Choudhury
Saurabh Manish Raje
Venkatesan T. Chakaravarthy
Yogish Sabharwal
Ashish Verma
26
18
0
24 Jan 2020
Implicit Priors for Knowledge Sharing in Bayesian Neural Networks
Jack K. Fitzsimons
Sebastian M. Schmon
Stephen J. Roberts
BDL
FedML
16
0
0
02 Dec 2019
Preparing Lessons: Improve Knowledge Distillation with Better Supervision
Tiancheng Wen
Shenqi Lai
Xueming Qian
25
68
0
18 Nov 2019