Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana · 21 December 2013 · arXiv:1312.6184
Papers citing "Do Deep Nets Really Need to be Deep?" (50 of 379 papers shown)
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija, Shashank Kapoor, Lakshit Arora, Dipen Pradhan, Aman Raj, Ankit Shetgaonkar · 05 May 2025

Transfer Learning with Pre-trained Conditional Generative Models
Shin'ya Yamaguchi, Sekitoshi Kanai, Atsutoshi Kumagai, Daiki Chijiwa, H. Kashima · VLM, CLL, BDL, DiffM · 21 Feb 2025

TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing, Kou Misaki, Han Bao, Sho Yokoi, Takuya Akiba · VLM · 28 Jan 2025
Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
You Wu, Yongxin Li, Mengyuan Liu, Xucheng Wang, Xiangyang Yang, Hengzhou Ye, Dan Zeng, Qijun Zhao, Shuiwang Li · 28 Dec 2024

Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu, Shengcao Cao, Yu-xiong Wang · 18 Oct 2024

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger, Jon Barker, Greg Heinrich, Pavlo Molchanov, Bryan Catanzaro, Andrew Tao · 02 Oct 2024

Distilling System 2 into System 1
Ping Yu, Jing Xu, Jason Weston, Ilia Kulikov · OffRL, LRM · 08 Jul 2024
A Label is Worth a Thousand Images in Dataset Distillation
Tian Qin, Zhiwei Deng, David Alvarez-Melis · DD · 15 Jun 2024

DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas · 12 Jun 2024

Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework
Junxian Li, Bin Shi, Erfei Cui, Hua Wei, Qinghua Zheng · 02 Mar 2024

m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers
Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu · MoE · 26 Feb 2024
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal · 17 Feb 2024

Logit Poisoning Attack in Distillation-based Federated Learning and its Countermeasures
Yonghao Yu, Shunan Zhu, Jinglu Hu · AAML, FedML · 31 Jan 2024

Mutual Distillation Learning For Person Re-Identification
Huiyuan Fu, Kuilong Cui, Chuanming Wang, Mengshi Qi, Huadong Ma · 12 Jan 2024

The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Pratyusha Sharma, Jordan T. Ash, Dipendra Kumar Misra · LRM · 21 Dec 2023
Mixed Distillation Helps Smaller Language Model Better Reasoning
Chenglin Li, Qianglong Chen, Liangyue Li, Wang Caiyu, Yicheng Li, Yin Zhang · LRM · 17 Dec 2023

AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One
Michael Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov · VLM · 10 Dec 2023

DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency
Azhar Shaikh, Michael Cochez, Denis Diachkov, Michiel de Rijcke, Sahar Yousefi · 09 Nov 2023

SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo · FAtt, AAML · 09 Nov 2023
ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi, Adhiraj Ghosh, Hendrik P. A. Lensch · DiffM · 16 Oct 2023

Foreground Object Search by Distilling Composite Image Feature
Bo Zhang, Jiacheng Sui, Li Niu · 09 Aug 2023

Exploring the Lottery Ticket Hypothesis with Explainability Methods: Insights into Sparse Network Performance
Shantanu Ghosh, Kayhan Batmanghelich · 07 Jul 2023

Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
Nicolae-Cătălin Ristea, Florinel-Alin Croitoru, Radu Tudor Ionescu, Marius Popescu, Fahad Shahbaz Khan, M. Shah · ViT · 21 Jun 2023

Interpretable Differencing of Machine Learning Models
Swagatam Haldar, Diptikalyan Saha, Dennis L. Wei, Rahul Nair, Elizabeth M. Daly · 10 Jun 2023
ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models
Suzanna Parkinson, Greg Ongie, Rebecca Willett · 24 May 2023

Learning Interpretable Style Embeddings via Prompting LLMs
Ajay Patel, D. Rao, Ansh Kothary, Kathleen McKeown, Chris Callison-Burch · 22 May 2023

Analyzing Compression Techniques for Computer Vision
Maniratnam Mandal, Imran Khan · 14 May 2023

Similarity of Neural Network Models: A Survey of Functional and Representational Measures
Max Klabunde, Tobias Schumacher, M. Strohmaier, Florian Lemmerich · 10 May 2023

Target-Side Augmentation for Document-Level Machine Translation
Guangsheng Bao, Zhiyang Teng, Yue Zhang · 08 May 2023
Madvex: Instrumentation-based Adversarial Attacks on Machine Learning Malware Detection
Yang Cai, Felix Mächtle, C. Daskalakis, Volodymyr Bezsmertnyi, T. Eisenbarth · AAML · 04 May 2023

A Survey on Solving and Discovering Differential Equations Using Deep Neural Networks
Hyeonjung Jung, Jayant Gupta, B. Jayaprakash, Matthew J. Eagon, Harish Selvam, Carl Molnar, W. Northrop, Shashi Shekhar · AI4CE · 26 Apr 2023

PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized Perception with Neural Sensors
Haley M. So, Laurie Bose, Piotr Dudek, Gordon Wetzstein · 11 Apr 2023

Self-Distillation for Gaussian Process Regression and Classification
Kenneth Borup, L. Andersen · 05 Apr 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
Ximeng Sun, Pengchuan Zhang, Peizhao Zhang, Hardik Shah, Kate Saenko, Xide Xia · VLM · 31 Mar 2023

CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Ge Li, Hasan Hammoud, Hani Itani, Dmitrii Khizbullin, Guohao Li · SyDa, ALM · 31 Mar 2023

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
Aneeshan Sain, A. Bhunia, Subhadeep Koley, Pinaki Nath Chowdhury, Soumitri Chattopadhyay, Tao Xiang, Yi-Zhe Song · 24 Mar 2023

MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation
Vitaliy Kinakh, M. Drozdova, Slava Voloshynovskiy · 21 Mar 2023
Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi, Takuma Udagawa, Michele Merler, Yikang Shen, Yousef El-Kurdi, Bishwaranjan Bhattacharjee · 16 Mar 2023

Students Parrot Their Teachers: Membership Inference on Model Distillation
Matthew Jagielski, Milad Nasr, Christopher A. Choquette-Choo, Katherine Lee, Nicholas Carlini · FedML · 06 Mar 2023

Distilling Calibrated Student from an Uncalibrated Teacher
Ishan Mishra, Sethu Vamsi Krishna, Deepak Mishra · FedML · 22 Feb 2023

Cross Modal Distillation for Flood Extent Mapping
Shubhika Garg, Ben Feinstein, Shahar Timnat, Vishal Batchu, G. Dror, Adi Gerzi Rosenthal, Varun Gulshan · 16 Feb 2023
Deep Learning Meets Sparse Regularization: A Signal Processing Perspective
Rahul Parhi, Robert D. Nowak · 23 Jan 2023

Dataset Distillation: A Comprehensive Review
Ruonan Yu, Songhua Liu, Xinchao Wang · DD · 17 Jan 2023

In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision
Gourav Datta, Zeyu Liu, Md. Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, P. Beerel · 21 Dec 2022

Gait Recognition Using 3-D Human Body Shape Inference
Haidong Zhu, Zhao-Heng Zheng, Ramkant Nevatia · CVBM, 3DH · 18 Dec 2022
Co-training 2^L Submodels for Visual Recognition
Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou · VLM · 09 Dec 2022
Investigating certain choices of CNN configurations for brain lesion segmentation
Masoomeh Rahimpour, A. Radwan, Henri Vandermeulen, S. Sunaert, K. Goffin, M. Koole · 02 Dec 2022

Distilling Reasoning Capabilities into Smaller Language Models
Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan · LRM, ReLM · 01 Dec 2022

Decentralized Learning with Multi-Headed Distillation
A. Zhmoginov, Mark Sandler, Nolan Miller, Gus Kristiansen, Max Vladymyrov · FedML · 28 Nov 2022

Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation
Florinel-Alin Croitoru, Nicolae-Cătălin Ristea, D. Dascalescu, Radu Tudor Ionescu, Fahad Shahbaz Khan, M. Shah · 28 Nov 2022