Transferring Inductive Biases through Knowledge Distillation

Samira Abnar, Mostafa Dehghani, Willem H. Zuidema · arXiv 2006.00555 · 31 May 2020

Papers citing "Transferring Inductive Biases through Knowledge Distillation" (16 papers)
Efficient and Robust Jet Tagging at the LHC with Knowledge Distillation
Ryan Liu, A. Gandrakota, J. Ngadiuba, M. Spiropulu, J. Vlimant · 23 Nov 2023
Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, …, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio · 13 Oct 2023
Knowledge Distillation for Anomaly Detection
Adrian Alan Pol, E. Govorkova, Sonja Grönroos, N. Chernyavskaya, Philip C. Harris, M. Pierini, I. Ojalvo, P. Elmer · 09 Oct 2023
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer
Jiahao Wang, Songyang Zhang, Yong Liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai-xiang Chen, Ping Luo, Dahua Lin · 12 Apr 2023
Distillation from Heterogeneous Models for Top-K Recommendation
SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu · 02 Mar 2023 · VLM
Adaptive Computation with Elastic Input Sequence
Fuzhao Xue, Valerii Likhosherstov, Anurag Arnab, N. Houlsby, Mostafa Dehghani, Yang You · 30 Jan 2023
Co-training $2^L$ Submodels for Visual Recognition
Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou · 09 Dec 2022 · VLM
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, W. Fedus, J. Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler · 21 Jul 2022 · AI4CE
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Xianing Chen, Qiong Cao, Yujie Zhong, Jing Zhang, Shenghua Gao, Dacheng Tao · 27 Apr 2022 · ViT
Forward Compatible Training for Large-Scale Embedding Retrieval Systems
Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, Hadi Pouransari · 06 Dec 2021 · VLM
A Survey of Visual Transformers
Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He · 11 Nov 2021 · 3DGS, ViT
The Benchmark Lottery
Mostafa Dehghani, Yi Tay, A. Gritsenko, Zhe Zhao, N. Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals · 14 Jul 2021
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation
Emilio Parisotto, Ruslan Salakhutdinov · 04 Apr 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli, Hugo Touvron, Matthew L. Leavitt, Ari S. Morcos, Giulio Biroli, Levent Sagun · 19 Mar 2021 · ViT
Self-Distillation Amplifies Regularization in Hilbert Space
H. Mobahi, Mehrdad Farajtabar, Peter L. Bartlett · 13 Feb 2020
Large scale distributed neural network training through online distillation
Rohan Anil, Gabriel Pereyra, Alexandre Passos, Róbert Ormándi, George E. Dahl, Geoffrey E. Hinton · 09 Apr 2018 · FedML