Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation

16 May 2025
Reilly Haskins, Benjamin Adams
Author contacts: reilly.haskins@pg.canterbury.ac.nz, benjamin.adams@canterbury.ac.nz
arXiv: 2505.10822 (abs · PDF · HTML)

Papers citing "Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation"

16 / 16 papers shown

Towards Understanding Distilled Reasoning Models: A Representational Approach
David D. Baek, Max Tegmark · LRM · 4 citations · 05 Mar 2025

Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis
Xiang Wang, Yan Hu, Wenyu Du, Reynold Cheng, Benyou Wang, Difan Zou · 3 citations · 17 Feb 2025

Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska, E. Gavves · AI4CE · 151 citations · 22 Apr 2024

Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation
Songming Zhang, Yunlong Liang, Shuaibo Wang, Wenjuan Han, Jian Liu, Jinan Xu · 10 citations · 14 May 2023

Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà · 55 citations · 08 Mar 2022

Locating and Editing Factual Associations in GPT
Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov · KELM · 1,357 citations · 10 Feb 2022

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson · FedML · 220 citations · 10 Jun 2021

Towards Understanding Knowledge Distillation
Mary Phuong, Christoph H. Lampert · 321 citations · 27 May 2021

Pretrained Transformers Improve Out-of-Distribution Robustness
Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, R. Krishnan, Basel Alomair · OOD · 434 citations · 13 Apr 2020

Contrastive Representation Distillation
Yonglong Tian, Dilip Krishnan, Phillip Isola · 1,049 citations · 23 Oct 2019

Knowledge Distillation from Internal Representations
Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Edward Guo · 182 citations · 08 Oct 2019

On the Efficacy of Knowledge Distillation
Ligang He, Rui Mao · 609 citations · 03 Oct 2019

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf · 7,520 citations · 02 Oct 2019

TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu · VLM · 1,860 citations · 23 Sep 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · VLM, SSL, SSeg · 94,891 citations · 11 Oct 2018

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin · 3DV · 131,652 citations · 12 Jun 2017