1510.00149
Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher
Kanghyun Choi
Hye Yoon Lee
Deokki Hong
Joonsang Yu
Noseong Park
Youngsok Kim
Jinho Lee
MQ
118
33
0
31 Mar 2022
Physics Community Needs, Tools, and Resources for Machine Learning
Philip C. Harris
E. Katsavounidis
W. McCormack
D. Rankin
Yongbin Feng
...
De-huai Chen
Mark S. Neubauer
Javier Mauricio Duarte
G. Karagiorgi
Miaoyuan Liu
AI4CE
70
4
0
30 Mar 2022
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Shaojin Ding
Phoenix Meadowlark
Yanzhang He
Lukasz Lew
Shivani Agrawal
Oleg Rybakov
MQ
108
36
0
29 Mar 2022
REx: Data-Free Residual Quantization Error Expansion
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
106
8
0
28 Mar 2022
On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks
Hongru Yang
Zhangyang Wang
MLT
106
8
0
27 Mar 2022
SlimFL: Federated Learning with Superposition Coding over Slimmable Neural Networks
Won Joon Yun
Yunseok Kwak
Hankyul Baek
Soyi Jung
Mingyue Ji
M. Bennis
Jihong Park
Joongheon Kim
80
17
0
26 Mar 2022
Playing Lottery Tickets in Style Transfer Models
Meihao Kong
Jing Huo
Wenbin Li
Jing Wu
Yu-kun Lai
Yang Gao
68
1
0
25 Mar 2022
Searching for Network Width with Bilaterally Coupled Network
Xiu Su
Shan You
Jiyang Xie
Fei Wang
Chao Qian
Changshui Zhang
Chang Xu
75
7
0
25 Mar 2022
Lightweight Graph Convolutional Networks with Topologically Consistent Magnitude Pruning
H. Sahbi
GNN
56
1
0
25 Mar 2022
Deformable Butterfly: A Highly Structured and Sparse Linear Transform
R. Lin
Jie Ran
King Hung Chiu
Graziano Chesi
Ngai Wong
57
15
0
25 Mar 2022
Vision Transformer Compression with Structured Pruning and Low Rank Approximation
Ankur Kumar
ViT
38
6
0
25 Mar 2022
Q-PPG: Energy-Efficient PPG-based Heart Rate Monitoring on Wearable Devices
Luca Bompani
Daniele Jahier Pagliari
Matteo Risso
Simone Benatti
Enrico Macii
Luca Benini
Massimo Poncino
69
41
0
24 Mar 2022
Duality-Induced Regularizer for Semantic Matching Knowledge Graph Embeddings
Jie Wang
Zhanqiu Zhang
Zhihao Shi
Jianyu Cai
Shuiwang Ji
Feng Wu
103
11
0
24 Mar 2022
Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering
Zhou Yu
Zitian Jin
Jun Yu
Mingliang Xu
Hongbo Wang
Jianping Fan
75
4
0
24 Mar 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
Ali Hadi Zadeh
Mostafa Mahmoud
Ameer Abdelhadi
Andreas Moshovos
MQ
99
33
0
23 Mar 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
72
7
0
22 Mar 2022
Training Quantised Neural Networks with STE Variants: the Additive Noise Annealing Algorithm
Matteo Spallanzani
G. P. Leonardi
Luca Benini
54
3
0
21 Mar 2022
Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention
Zuzana Jelčicová
Marian Verhelst
92
5
0
20 Mar 2022
Towards Device Efficient Conditional Image Generation
Nisarg A. Shah
Gaurav Bharaj
103
3
0
19 Mar 2022
Learning Compressed Embeddings for On-Device Inference
Niketan Pansare
J. Katukuri
Aditya Arora
F. Cipollone
R. Shaik
Noyan Tokgozoglu
Chandru Venkataraman
103
15
0
18 Mar 2022
Stability and Risk Bounds of Iterative Hard Thresholding
Xiao-Tong Yuan
P. Li
66
13
0
17 Mar 2022
Confidence Dimension for Deep Learning based on Hoeffding Inequality and Relative Evaluation
Runqi Wang
Linlin Yang
Baochang Zhang
Wentao Zhu
David Doermann
Guodong Guo
50
1
0
17 Mar 2022
PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression
D. Vo
Akihiro Sugimoto
Hideki Nakayama
48
5
0
16 Mar 2022
Unified Visual Transformer Compression
Shixing Yu
Tianlong Chen
Jiayi Shen
Huan Yuan
Jianchao Tan
Sen Yang
Ji Liu
Zhangyang Wang
ViT
99
94
0
15 Mar 2022
Approximability and Generalisation
A. J. Turner
Ata Kabán
64
0
0
15 Mar 2022
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
Zhongzhi Yu
Y. Fu
Shang Wu
Mengquan Li
Haoran You
Yingyan Lin
70
1
0
15 Mar 2022
Energy-efficient Dense DNN Acceleration with Signed Bit-slice Architecture
Dongseok Im
Gwangtae Park
Zhiyong Li
Junha Ryu
H. Yoo
MQ
31
0
0
15 Mar 2022
FlexBlock: A Flexible DNN Training Accelerator with Multi-Mode Block Floating Point Support
Seock-Hwan Noh
Jahyun Koo
Seunghyun Lee
Jongse Park
Jaeha Kung
AI4CE
78
18
0
13 Mar 2022
Towards On-Device AI and Blockchain for 6G enabled Agricultural Supply-chain Management
Muhammad Zawish
Nouman Ashraf
R. I. Ansari
Steven Davy
Hassaan Khaliq Qureshi
N. Aslam
S. Hassan
18
12
0
12 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
104
178
0
11 Mar 2022
DNN Training Acceleration via Exploring GPGPU Friendly Sparsity
Zhuoran Song
Yihong Xu
Han Li
Naifeng Jing
Xiaoyao Liang
Li Jiang
59
3
0
11 Mar 2022
An Empirical Study of Low Precision Quantization for TinyML
Shaojie Zhuo
Hongyu Chen
R. Ramakrishnan
Tommy Chen
Chen Feng
Yi-Rung Lin
Parker Zhang
Liang Shen
MQ
133
13
0
10 Mar 2022
A Brain-Inspired Low-Dimensional Computing Classifier for Inference on Tiny Devices
Shijin Duan
Xiaolin Xu
Shaolei Ren
84
13
0
09 Mar 2022
CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction
Zhuoran Song
Yihong Xu
Zhezhi He
Li Jiang
Naifeng Jing
Xiaoyao Liang
ViT
83
43
0
09 Mar 2022
The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks
Xin Yu
Thiago Serra
Srikumar Ramalingam
Shandian Zhe
107
49
0
09 Mar 2022
Dual Lottery Ticket Hypothesis
Yue Bai
Haiquan Wang
Zhiqiang Tao
Kunpeng Li
Yun Fu
83
40
0
08 Mar 2022
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
72
13
0
08 Mar 2022
Differentially Private Federated Learning with Local Regularization and Sparsification
Anda Cheng
Peisong Wang
Xi Sheryl Zhang
Jian Cheng
FedML
77
78
0
07 Mar 2022
Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance
Shiwei Liu
Yuesong Tian
Tianlong Chen
Li Shen
112
11
0
05 Mar 2022
Structured Pruning is All You Need for Pruning CNNs at Initialization
Yaohui Cai
Weizhe Hua
Hongzheng Chen
G. E. Suh
Christopher De Sa
Zhiru Zhang
CVBM
104
15
0
04 Mar 2022
Provable and Efficient Continual Representation Learning
Yingcong Li
Mingchen Li
M. Salman Asif
Samet Oymak
CLL
77
13
0
03 Mar 2022
Fast Neural Architecture Search for Lightweight Dense Prediction Networks
Lam Huynh
Esa Rahtu
Juan E. Sala Matas
J. Heikkilä
96
2
0
03 Mar 2022
SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment
Yixin Chen
Zhuotao Tian
Pengguang Chen
Shu Liu
Jiaya Jia
ObjD
28
1
0
02 Mar 2022
Arrhythmia Classifier Using Convolutional Neural Network with Adaptive Loss-aware Multi-bit Networks Quantization
Hanshi Sun
Ao Wang
Ninghao Pu
Zhiqing Li
Jung Y. Huang
Hao Liu
Zhiyu Qi
MQ
47
4
0
27 Feb 2022
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
Hanrui Wang
Zi-Chen Li
Jiaqi Gu
Yongshan Ding
David Z. Pan
Song Han
119
54
0
26 Feb 2022
Extracting Effective Subnetworks with Gumbel-Softmax
Robin Dupont
M. Alaoui
H. Sahbi
A. Lebois
68
6
0
25 Feb 2022
Standard Deviation-Based Quantization for Deep Neural Networks
Amir Ardakani
A. Ardakani
B. Meyer
J. Clark
W. Gross
MQ
95
1
0
24 Feb 2022
Rare Gems: Finding Lottery Tickets at Initialization
Kartik K. Sreenivasan
Jy-yong Sohn
Liu Yang
Matthew Grinde
Alliot Nagle
Hongyi Wang
Eric P. Xing
Kangwook Lee
Dimitris Papailiopoulos
71
42
0
24 Feb 2022
The Larger The Fairer? Small Neural Networks Can Achieve Fairness for Edge Devices
Yi Sheng
Junhuan Yang
Yawen Wu
Kevin Mao
Yiyu Shi
Jingtong Hu
Weiwen Jiang
Lei Yang
113
28
0
23 Feb 2022
Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms
R. Saha
Mert Pilanci
Andrea J. Goldsmith
MQ
96
3
0
23 Feb 2022