Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation
arXiv: 2505.10822 · 16 May 2025
Authors: Reilly Haskins, Benjamin Adams
Contact: reilly.haskins@pg.canterbury.ac.nz, benjamin.adams@canterbury.ac.nz
Papers citing "Distilled Circuits: A Mechanistic Study of Internal Restructuring in Knowledge Distillation" (16 papers shown)

| Title | Authors | Topics | Citations | Date |
| --- | --- | --- | --- | --- |
| Towards Understanding Distilled Reasoning Models: A Representational Approach | David D. Baek, Max Tegmark | LRM | 4 | 05 Mar 2025 |
| Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis | Xiang Wang, Yan Hu, Wenyu Du, Reynold Cheng, Benyou Wang, Difan Zou | | 3 | 17 Feb 2025 |
| Mechanistic Interpretability for AI Safety -- A Review | Leonard Bereska, E. Gavves | AI4CE | 151 | 22 Apr 2024 |
| Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation | Songming Zhang, Yunlong Liang, Shuaibo Wang, Wenjuan Han, Jian Liu, Jinan Xu | | 10 | 14 May 2023 |
| Measuring the Mixing of Contextual Information in the Transformer | Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà | | 55 | 08 Mar 2022 |
| Locating and Editing Factual Associations in GPT | Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov | KELM | 1,357 | 10 Feb 2022 |
| Does Knowledge Distillation Really Work? | Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson | FedML | 220 | 10 Jun 2021 |
| Towards Understanding Knowledge Distillation | Mary Phuong, Christoph H. Lampert | | 321 | 27 May 2021 |
| Pretrained Transformers Improve Out-of-Distribution Robustness | Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, R. Krishnan, Basel Alomair | OOD | 434 | 13 Apr 2020 |
| Contrastive Representation Distillation | Yonglong Tian, Dilip Krishnan, Phillip Isola | | 1,049 | 23 Oct 2019 |
| Knowledge Distillation from Internal Representations | Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Edward Guo | | 182 | 08 Oct 2019 |
| On the Efficacy of Knowledge Distillation | Ligang He, Rui Mao | | 609 | 03 Oct 2019 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf | | 7,520 | 02 Oct 2019 |
| TinyBERT: Distilling BERT for Natural Language Understanding | Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu | VLM | 1,860 | 23 Sep 2019 |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | VLM, SSL, SSeg | 94,891 | 11 Oct 2018 |
| Attention Is All You Need | Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV | 131,652 | 12 Jun 2017 |