1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
Grassmannian Packings in Neural Networks: Learning with Maximal Subspace Packings for Diversity and Anti-Sparsity
Dian Ang Yap
Nicholas Roberts
Vinay Uday Prabhu
27
2
0
18 Nov 2019
Provable Filter Pruning for Efficient Neural Networks
Lucas Liebenwein
Cenk Baykal
Harry Lang
Dan Feldman
Daniela Rus
VLM
3DPC
105
141
0
18 Nov 2019
S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search
Zhihang Yuan
Bingzhe Wu
Zheng Liang
Shiwan Zhao
Weichen Bi
Guangyu Sun
77
30
0
16 Nov 2019
ASCAI: Adaptive Sampling for acquiring Compact AI
Mojan Javaheripi
Mohammad Samragh
T. Javidi
F. Koushanfar
44
2
0
15 Nov 2019
Structured Sparsification of Gated Recurrent Neural Networks
E. Lobacheva
Nadezhda Chirkova
Alexander Markovich
Dmitry Vetrov
74
3
0
13 Nov 2019
DupNet: Towards Very Tiny Quantized CNN with Improved Accuracy for Face Detection
Hongxing Gao
Wei Tao
Dongchao Wen
Junjie Liu
Tse-Wei Chen
Kinya Osa
Masami Kato
CVBM
41
5
0
13 Nov 2019
word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement
Ali (Aliakbar) Panahi
Seyran Saeedi
Tom Arodz
61
32
0
12 Nov 2019
Iteratively Training Look-Up Tables for Network Quantization
Fabien Cardinaux
Stefan Uhlich
K. Yoshiyama
Javier Alonso García
Lukas Mauch
Stephen Tiedemann
Thomas Kemp
Akira Nakamura
MQ
102
16
0
12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
97
284
0
10 Nov 2019
Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection
Vicent Sanz Marco
Ben Taylor
Ziyi Wang
Y. Elkhatib
70
61
0
09 Nov 2019
Hardware-aware Pruning of DNNs using LFSR-Generated Pseudo-Random Indices
Foroozan Karimzadeh
N. Cao
Brian Crafton
Justin Romberg
A. Raychowdhury
42
13
0
09 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkšić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
126
200
0
09 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition
Alex Bie
Bharat Venkitesh
João Monteiro
Md. Akmal Haidar
Mehdi Rezagholizadeh
MQ
144
27
0
09 Nov 2019
Deep geometric knowledge distillation with graphs
Carlos Lassance
Myriam Bontonou
G. B. Hacene
Vincent Gripon
Jian Tang
Antonio Ortega
64
39
0
08 Nov 2019
The Pitfall of Evaluating Performance on Emerging AI Accelerators
Zihan Jiang
Jiansong Li
Jianfeng Zhan
47
2
0
08 Nov 2019
Sparsity through evolutionary pruning prevents neuronal networks from overfitting
Richard C. Gerum
A. Erpenbeck
P. Krauss
A. Schilling
63
55
0
07 Nov 2019
MLPerf Inference Benchmark
Vijayarāghava Reḍḍī
C. Cheng
David Kanter
Pete H Mattson
Guenther Schmuelling
...
Bing Yu
George Y. Yuan
Aaron Zhong
P. Zhang
Yuchen Zhou
133
510
0
06 Nov 2019
A Programmable Approach to Neural Network Compression
Vinu Joseph
Saurav Muralidharan
Animesh Garg
M. Garland
Ganesh Gopalakrishnan
80
10
0
06 Nov 2019
Deep Compressed Pneumonia Detection for Low-Power Embedded Devices
Hongjia Li
Sheng Lin
Ning Liu
Caiwen Ding
Yanzhi Wang
56
1
0
04 Nov 2019
SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
R. Yazdani
Olatunji Ruwase
Minjia Zhang
Yuxiong He
J. Arnau
Antonio González
63
5
0
04 Nov 2019
Ternary MobileNets via Per-Layer Hybrid Filter Banks
Dibakar Gope
Jesse G. Beu
Urmish Thakker
Matthew Mattina
MQ
71
15
0
04 Nov 2019
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
Lei Deng
Yujie Wu
Yifan Hu
Ling Liang
Guoqi Li
Xing Hu
Yufei Ding
Peng Li
Yuan Xie
80
85
0
03 Nov 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective
Sauptik Dhar
Junyao Guo
Jiayi Liu
S. Tripathi
Unmesh Kurup
Mohak Shah
148
144
0
02 Nov 2019
Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
Niccoló Nicodemo
Gaurav Naithani
Konstantinos Drossos
Tuomas Virtanen
R. Saletti
MQ
18
1
0
01 Nov 2019
Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Michael Kuchnik
George Amvrosiadis
Virginia Smith
105
9
0
01 Nov 2019
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
Xishan Zhang
Shaoli Liu
Rui Zhang
Chang-Shu Liu
Di Huang
...
Jiaming Guo
Yu Kang
Qi Guo
Zidong Du
Yunji Chen
MQ
56
7
0
01 Nov 2019
ALERT: Accurate Learning for Energy and Timeliness
Chengcheng Wan
M. Santriaji
E. Rogers
H. Hoffmann
Michael Maire
Shan Lu
AI4CE
113
42
0
31 Oct 2019
RIGA: Covert and Robust White-Box Watermarking of Deep Neural Networks
Tianhao Wang
Florian Kerschbaum
AAML
100
36
0
31 Oct 2019
Distilling Pixel-Wise Feature Similarities for Semantic Segmentation
Yuhu Shan
47
7
0
31 Oct 2019
Towards Scalable, Efficient and Accurate Deep Spiking Neural Networks with Backward Residual Connections, Stochastic Softmax and Hybridization
Priyadarshini Panda
Sai Aparna Aketi
Kaushik Roy
80
33
0
30 Oct 2019
E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
Yue Wang
Ziyu Jiang
Xiaohan Chen
Pengfei Xu
Yang Zhao
Yingyan Lin
Zhangyang Wang
MQ
116
83
0
29 Oct 2019
Decomposable-Net: Scalable Low-Rank Compression for Neural Networks
A. Yaguchi
Taiji Suzuki
Shuhei Nitta
Y. Sakata
A. Tanizawa
88
9
0
29 Oct 2019
Active Subspace of Neural Networks: Structural Analysis and Universal Attacks
Chunfeng Cui
Kaiqi Zhang
Talgat Daulbaev
Julia Gusak
Ivan Oseledets
Zheng Zhang
AAML
69
25
0
29 Oct 2019
A Simple but Effective BERT Model for Dialog State Tracking on Resource-Limited Systems
T. Lai
Quan Hung Tran
Trung Bui
Daisuke Kihara
43
29
0
28 Oct 2019
Secure Evaluation of Quantized Neural Networks
Anders Dalskov
Daniel E. Escudero
Marcel Keller
119
143
0
28 Oct 2019
Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework
Srinidhi Hegde
Ranjitha Prasad
R. Hebbalaguppe
Vishwajith Kumar
40
18
0
26 Oct 2019
Structural sparsification for Far-field Speaker Recognition with GNA
Jingchi Zhang
Jonathan Huang
Michael Deisher
Hai Helen Li
Yiran Chen
38
0
0
25 Oct 2019
A Comparative Study of Neural Network Compression
H. Baktash
Emanuele Natale
L. Viennot
AAML
13
0
0
24 Oct 2019
An Adaptive Empirical Bayesian Method for Sparse Deep Learning
Wei Deng
Xiao Zhang
F. Liang
Guang Lin
BDL
126
44
0
23 Oct 2019
EdgeAI: A Vision for Deep Learning in IoT Era
Kartikeya Bhardwaj
Naveen Suda
R. Marculescu
36
12
0
23 Oct 2019
Improving singing voice separation with the Wave-U-Net using Minimum Hyperspherical Energy
Joaquin Perez-Lapillo
Oleksandr Galkin
Tillman Weyde
53
12
0
22 Oct 2019
Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks
Yihui He
Jianing Qian
Jianren Wang
Cindy X. Le
Congrui Hetang
Qi Lyu
Wenping Wang
Tianwei Yue
101
11
0
21 Oct 2019
Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)
Yinan Yu
Samuel Scheidegger
T. McKelvey
44
2
0
21 Oct 2019
Self-Adaptive Network Pruning
Jinting Chen
Zhaocheng Zhu
Chengwei Li
Yuming Zhao
3DPC
50
22
0
20 Oct 2019
Fully Quantized Transformer for Machine Translation
Gabriele Prato
Ella Charlaix
Mehdi Rezagholizadeh
MQ
79
70
0
17 Oct 2019
SPEC2: SPECtral SParsE CNN Accelerator on FPGAs
Yue Niu
Hanqing Zeng
Ajitesh Srivastava
Kartik Lakhotia
Rajgopal Kannan
Yanzhi Wang
Viktor Prasanna
MQ
51
8
0
16 Oct 2019
Compacting, Picking and Growing for Unforgetting Continual Learning
Steven C. Y. Hung
Cheng-Hao Tu
Cheng-En Wu
Chien-Hung Chen
Yi-Ming Chan
Chu-Song Chen
CLL
134
318
0
15 Oct 2019
TCD-NPE: A Re-configurable and Efficient Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs
Ali Mirzaeian
Houman Homayoun
Avesta Sasan
BDL
36
10
0
14 Oct 2019
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir
Guy Boudoukh
Peter Izsak
Moshe Wasserblat
MQ
121
507
0
14 Oct 2019
Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-based Approach
Haichuan Yang
Shupeng Gui
Yuhao Zhu
Ji Liu
MQ
76
5
0
14 Oct 2019