Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana
arXiv:1312.6184, 21 December 2013
Papers citing "Do Deep Nets Really Need to be Deep?" (50 of 379 papers shown)
| Title | Authors | Tags | Counts | Date |
|---|---|---|---|---|
| Compressing Transformer-based self-supervised models for speech processing | Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang | | 40 / 6 / 0 | 17 Nov 2022 |
| Partial Binarization of Neural Networks for Budget-Aware Efficient Learning | Udbhav Bamba, Neeraj Anand, Saksham Aggarwal, Dilip K Prasad, D. K. Gupta | MQ | 26 / 0 / 0 | 12 Nov 2022 |
| Avoid Overthinking in Self-Supervised Models for Speech Recognition | Dan Berrebbi, Brian Yan, Shinji Watanabe | LRM | 26 / 4 / 0 | 01 Nov 2022 |
| Pixel-Wise Contrastive Distillation | Junqiang Huang, Zichao Guo | | 42 / 4 / 0 | 01 Nov 2022 |
| QuaLA-MiniLM: a Quantized Length Adaptive MiniLM | Shira Guskin, Moshe Wasserblat, Chang Wang, Haihao Shen | MQ | 19 / 2 / 0 | 31 Oct 2022 |
| Collaborative Multi-Teacher Knowledge Distillation for Learning Low Bit-width Deep Neural Networks | Cuong Pham, Tuan Hoang, Thanh-Toan Do | FedML, MQ | 34 / 14 / 0 | 27 Oct 2022 |
| Large Language Models Can Self-Improve | Jiaxin Huang, S. Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han | ReLM, AI4MH, LRM | 47 / 568 / 0 | 20 Oct 2022 |
| IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors | Sheng Xu, Yanjing Li, Bo-Wen Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Penglei Gao, Jinhu Lv | | 30 / 15 / 0 | 07 Oct 2022 |
| Meta-Ensemble Parameter Learning | Zhengcong Fei, Shuman Tian, Junshi Huang, Xiaoming Wei, Xiaolin K. Wei | OOD | 44 / 2 / 0 | 05 Oct 2022 |
| Using Knowledge Distillation to improve interpretable models in a retail banking context | Maxime Biehler, Mohamed Guermazi, Célim Starck | | 62 / 2 / 0 | 30 Sep 2022 |
| Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing | Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama | DD | 32 / 40 / 0 | 29 Sep 2022 |
| MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference | Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li | | 25 / 2 / 0 | 28 Sep 2022 |
| Efficient Few-Shot Learning Without Prompts | Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg | VLM | 36 / 183 / 0 | 22 Sep 2022 |
| Semi-Supervised and Unsupervised Deep Visual Learning: A Survey | Yanbei Chen, Massimiliano Mancini, Xiatian Zhu, Zeynep Akata | | 45 / 114 / 0 | 24 Aug 2022 |
| Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey | Dalin Zhang, Kaixuan Chen, Yan Zhao, B. Yang, Li-Ping Yao, Christian S. Jensen | | 48 / 3 / 0 | 22 Aug 2022 |
| Effectiveness of Function Matching in Driving Scene Recognition | Shingo Yashima | | 26 / 1 / 0 | 20 Aug 2022 |
| Causality-Inspired Taxonomy for Explainable Artificial Intelligence | Pedro C. Neto, Tiago B. Gonçalves, João Ribeiro Pinto, W. Silva, Ana F. Sequeira, Arun Ross, Jaime S. Cardoso | XAI | 43 / 12 / 0 | 19 Aug 2022 |
| Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment | Jie Zhu, Leye Wang, Xiao Han | | 28 / 9 / 0 | 11 Aug 2022 |
| ProSelfLC: Progressive Self Label Correction Towards A Low-Temperature Entropy State | Xinshao Wang, Yang Hua, Elyor Kodirov, S. Mukherjee, David A. Clifton, N. Robertson | | 30 / 6 / 0 | 30 Jun 2022 |
| Knowledge Distillation of Transformer-based Language Models Revisited | Chengqiang Lu, Jianwei Zhang, Yunfei Chu, Zhengyu Chen, Jingren Zhou, Fei Wu, Haiqing Chen, Hongxia Yang | VLM | 27 / 10 / 0 | 29 Jun 2022 |
| Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks | Zhiwei Bai, Tao Luo, Z. Xu, Yaoyu Zhang | | 31 / 5 / 0 | 26 May 2022 |
| Improving the Latent Space of Image Style Transfer | Yun-Hao Bai, Cairong Wang, C. Yuan, Yanbo Fan, Jue Wang | DRL | 37 / 0 / 0 | 24 May 2022 |
| Knowledge Distillation via the Target-aware Transformer | Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, G. Wang | ViT | 20 / 104 / 0 | 22 May 2022 |
| A Closer Look at Branch Classifiers of Multi-exit Architectures | Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao | | 14 / 4 / 0 | 28 Apr 2022 |
| HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using Knowledge Distillation | Qingze Guan, Zihao Sheng, Shibei Xue | 3DH | 21 / 15 / 0 | 20 Apr 2022 |
| Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation | Minsoo Kang, Jaeyoo Park, Bohyung Han | CLL | 27 / 179 / 0 | 02 Apr 2022 |
| R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis | Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov | | 42 / 80 / 0 | 31 Mar 2022 |
| Knowledge Distillation with the Reused Teacher Classifier | Defang Chen, Jianhan Mei, Hailin Zhang, C. Wang, Yan Feng, Chun-Yen Chen | | 36 / 167 / 0 | 26 Mar 2022 |
| Efficient Sub-structured Knowledge Distillation | Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Haitao Zheng | | 14 / 1 / 0 | 09 Mar 2022 |
| The rise of the lottery heroes: why zero-shot pruning is hard | Enzo Tartaglione | | 29 / 6 / 0 | 24 Feb 2022 |
| Distilled Neural Networks for Efficient Learning to Rank | F. M. Nardini, Cosimo Rulli, Salvatore Trani, Rossano Venturini | FedML | 29 / 16 / 0 | 22 Feb 2022 |
| Submodlib: A Submodular Optimization Library | Vishal Kaushal, Ganesh Ramakrishnan, Rishabh K. Iyer | | 50 / 12 / 0 | 22 Feb 2022 |
| Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning | Kexue Fu, Peng Gao, Renrui Zhang, Hongsheng Li, Yu Qiao, Manning Wang | SSL, 3DPC | 28 / 23 / 0 | 09 Feb 2022 |
| Learning Representation from Neural Fisher Kernel with Low-rank Approximation | Ruixiang Zhang, Shuangfei Zhai, Etai Littwin, J. Susskind | SSL | 36 / 3 / 0 | 04 Feb 2022 |
| Keyword localisation in untranscribed speech using visually grounded speech models | Kayode Olaleye, Dan Oneaţă, Herman Kamper | | 32 / 7 / 0 | 02 Feb 2022 |
| Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank? | Sheikh Shams Azam, Seyyedali Hosseinalipour, Qiang Qiu, Christopher G. Brinton | FedML | 26 / 20 / 0 | 01 Feb 2022 |
| Training Thinner and Deeper Neural Networks: Jumpstart Regularization | Carles Roger Riera Molina, Camilo Rey, Thiago Serra, Eloi Puertas, O. Pujol | | 27 / 4 / 0 | 30 Jan 2022 |
| Dynamic Rectification Knowledge Distillation | Fahad Rahman Amik, Ahnaf Ismat Tasin, Silvia Ahmed, M. M. L. Elahi, Nabeel Mohammed | | 28 / 5 / 0 | 27 Jan 2022 |
| Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer | Kaiqi Zhao, Yitao Chen, Ming Zhao | | 27 / 3 / 0 | 22 Jan 2022 |
| Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems | Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti | | 29 / 6 / 0 | 15 Jan 2022 |
| Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix | Peng Liu | | 19 / 0 / 0 | 21 Dec 2021 |
| Driver Drowsiness Detection Using Ensemble Convolutional Neural Networks on YawDD | Rais Mohammad Salman, Mahbubur Rashid, Rupal Roy, M. Ahsan, Zahed Siddique | | 19 / 11 / 0 | 20 Dec 2021 |
| An Experimental Study of the Impact of Pre-training on the Pruning of a Convolutional Neural Network | Nathan Hubens, M. Mancas, B. Gosselin, Marius Preda, T. Zaharia | VLM, CVBM | 23 / 8 / 0 | 15 Dec 2021 |
| Exploring Category-correlated Feature for Few-shot Image Classification | Jing Xu, Xinglin Pan, Xu Luo, Wenjie Pei, Zenglin Xu | | 27 / 4 / 0 | 14 Dec 2021 |
| The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image | Yuki M. Asano, Aaqib Saeed | | 43 / 7 / 0 | 01 Dec 2021 |
| Improving Deep Learning Interpretability by Saliency Guided Training | Aya Abdelsalam Ismail, H. C. Bravo, S. Feizi | FAtt | 25 / 80 / 0 | 29 Nov 2021 |
| Multi-label Iterated Learning for Image Classification with Label Ambiguity | Sai Rajeswar, Pau Rodríguez López, Soumye Singhal, David Vazquez, Rameswar Panda | VLM | 28 / 30 / 0 | 23 Nov 2021 |
| Teacher-Student Training and Triplet Loss to Reduce the Effect of Drastic Face Occlusion | Mariana-Iuliana Georgescu, Georgian-Emilian Duta, Radu Tudor Ionescu | 3DH, CVBM | 30 / 19 / 0 | 20 Nov 2021 |
| Meta-Teacher For Face Anti-Spoofing | Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, Zhen Lei | CVBM | 25 / 61 / 0 | 12 Nov 2021 |
| Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models | J. Yoon, H. Kim, Hyeon Seung Lee, Sunghwan Ahn, N. Kim | | 38 / 1 / 0 | 05 Nov 2021 |