Supervision Complexity and its Role in Knowledge Distillation

28 January 2023
Hrayr Harutyunyan, A. S. Rawat, A. Menon, Seungyeon Kim, Sanjiv Kumar

Papers citing "Supervision Complexity and its Role in Knowledge Distillation"

13 / 13 papers shown

Efficient Knowledge Distillation via Curriculum Extraction
Shivam Gupta, Sushrut Karmalkar
21 Mar 2025

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
A. S. Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, ..., Rakesh Shivanna, Sashank J. Reddi, A. Menon, Rohan Anil, Sanjiv Kumar
24 Oct 2024

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
M. E. Ildiz, Halil Alperen Gozeten, Ege Onur Taga, Marco Mondelli, Samet Oymak
24 Oct 2024

The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
Diyuan Wu, Ionut-Vlad Modoranu, M. Safaryan, Denis Kuznedelev, Dan Alistarh
30 Aug 2024

Learning Neural Networks with Sparse Activations
Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath, Raghu Meka
26 Jun 2024

Towards the Fundamental Limits of Knowledge Transfer over Finite Domains
Qingyue Zhao, Banghua Zhu
11 Oct 2023

Data Upcycling Knowledge Distillation for Image Super-Resolution
Yun-feng Zhang, Wei Li, Simiao Li, Hanting Chen, Zhaopeng Tu, Wenjun Wang, Bingyi Jing, Hai-lin Wang, Jie Hu
25 Sep 2023

Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering
Yijun Dong, Kevin Miller, Qiuyu Lei, Rachel A. Ward
20 Jul 2023

On student-teacher deviations in distillation: does it pay to disobey?
Vaishnavh Nagarajan, A. Menon, Srinadh Bhojanapalli, H. Mobahi, Sanjiv Kumar
30 Jan 2023

Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher
Mehdi Rezagholizadeh, A. Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, A. Ghodsi
16 Oct 2021

A linearized framework and a new benchmark for model selection for fine-tuning
Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, L. Zancato, Charless C. Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona
29 Jan 2021

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020

Large scale distributed neural network training through online distillation
Rohan Anil, Gabriel Pereyra, Alexandre Passos, Róbert Ormándi, George E. Dahl, Geoffrey E. Hinton
09 Apr 2018