ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

MoKD: Multi-Task Optimization for Knowledge Distillation
13 May 2025
Zeeshan Hayder, A. Cheraghian, Lars Petersson, Mehrtash Harandi
VLM
ArXiv (abs) · PDF · HTML

Papers citing "MoKD: Multi-Task Optimization for Knowledge Distillation"

21 papers shown.
Knowledge Distillation Based on Transformed Teacher Matching
Kaixiang Zheng, En-Hui Yang · 17 Feb 2024

DeiT III: Revenge of the ViT
Hugo Touvron, Matthieu Cord, Hervé Jégou · ViT · 14 Apr 2022

Decoupled Knowledge Distillation
Borui Zhao, Quan Cui, Renjie Song, Yiyu Qiu, Jiajun Liang · 16 Mar 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash · 13 Jan 2022

ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang · 08 Oct 2021

Hierarchical Self-supervised Augmented Knowledge Distillation
Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu · SSL · 29 Jul 2021

Co-advise: Cross Inductive Bias Distillation
Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao · 23 Jun 2021

Distilling Knowledge via Knowledge Review
Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia · 19 Apr 2021

Going deeper with Image Transformers
Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou · ViT · 31 Mar 2021

Training data-efficient image transformers & distillation through attention
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou · ViT · 23 Dec 2020

Wasserstein Contrastive Representation Distillation
Liqun Chen, Dong Wang, Zhe Gan, Jingjing Liu, Ricardo Henao, Lawrence Carin · 15 Dec 2020

TIDE: A General Toolbox for Identifying Object Detection Errors
Daniel Bolya, Sean Foley, James Hays, Judy Hoffman · 18 Aug 2020

Knowledge Distillation Meets Self-Supervision
Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy · FedML · 12 Jun 2020

Channel Distillation: Channel-Wise Attention for Knowledge Distillation
Zaida Zhou, Chaoran Zhuge, Xinwei Guan, Wen Liu · 02 Jun 2020

Designing Network Design Spaces
Ilija Radosavovic, Raj Prateek Kosaraju, Ross B. Girshick, Kaiming He, Piotr Dollár · GNN · 30 Mar 2020

Gradient Surgery for Multi-Task Learning
Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn · 19 Jan 2020

A Comprehensive Overhaul of Feature Distillation
Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, J. Choi · 03 Apr 2019

Deep Mutual Learning
Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu · FedML · 01 Jun 2017

Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Alex Kendall, Y. Gal, R. Cipolla · 3DH · 19 May 2017

FitNets: Hints for Thin Deep Nets
Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, C. Gatta, Yoshua Bengio · FedML · 19 Dec 2014

Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, C. L. Zitnick, Piotr Dollár · ObjD · 01 May 2014
