Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown
Title
Deep CNNs for Peripheral Blood Cell Classification
Ekta Gavas
Kaustubh Olpadkar
49
9
0
18 Oct 2021
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
69
3
0
18 Oct 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
153
45
0
18 Oct 2021
A Dimensionality Reduction Approach for Convolutional Neural Networks
L. Meneghetti
N. Demo
G. Rozza
182
14
0
18 Oct 2021
Finding Everything within Random Binary Networks
Kartik K. Sreenivasan
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
41
10
0
18 Oct 2021
Network Augmentation for Tiny Deep Learning
Han Cai
Chuang Gan
Ji Lin
Song Han
133
30
0
17 Oct 2021
Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations
Yu-Shan Tai
Chieh-Fang Teng
Cheng-Yang Chang
A. Wu
43
8
0
17 Oct 2021
S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks
Shiyu Liu
Chong Min John Tan
Mehul Motani
CLL
68
4
0
17 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
109
12
0
15 Oct 2021
Kronecker Decomposition for GPT Compression
Ali Edalati
Marzieh S. Tahaei
Ahmad Rashid
V. Nia
J. Clark
Mehdi Rezagholizadeh
102
36
0
15 Oct 2021
Joint Channel and Weight Pruning for Model Acceleration on Moblie Devices
Tianli Zhao
Xi Sheryl Zhang
Wentao Zhu
Jiaxing Wang
Sen Yang
Ji Liu
Jian Cheng
92
2
0
15 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen
Peisong Wang
Jian Cheng
MQ
95
69
0
13 Oct 2021
Improving Binary Neural Networks through Fully Utilizing Latent Weights
Weixiang Xu
Qiang Chen
Xiangyu He
Peisong Wang
Jian Cheng
MQ
74
6
0
12 Oct 2021
Mining the Weights Knowledge for Optimizing Neural Network Structures
Mengqiao Han
Xiabi Liu
Zhaoyang Hai
Xin Duan
27
1
0
11 Oct 2021
Visualizing the embedding space to explain the effect of knowledge distillation
Hyun Seung Lee
C. Wallraven
79
1
0
09 Oct 2021
Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis
Fahman Saeed
M. Hussain
Hatim Aboalsamh
Fadwa Al Adel
A. Owaifeer
114
6
0
08 Oct 2021
GNN is a Counter? Revisiting GNN for Question Answering
Kuan-Chieh Wang
Yuyu Zhang
Diyi Yang
Le Song
Tao Qin
LMTD
79
31
0
07 Oct 2021
Random matrices in service of ML footprint: ternary random features with no performance loss
Hafiz Tiomoko Ali
Zhenyu Liao
Romain Couillet
87
7
0
05 Oct 2021
Progressive Transmission and Inference of Deep Learning Models
Youngsoo Lee
Sangdoo Yun
Yeonghun Kim
Sunghee Choi
54
2
0
03 Oct 2021
One Timestep is All You Need: Training Spiking Neural Networks with Ultra Low Latency
Sayeed Shafayet Chowdhury
Nitin Rathi
Kaushik Roy
87
41
0
01 Oct 2021
Prune Your Model Before Distill It
Jinhyuk Park
Albert No
VLM
132
28
0
30 Sep 2021
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition
Marawan Gamal Abdel Hameed
Marzieh S. Tahaei
A. Mosleh
V. Nia
90
26
0
29 Sep 2021
Smart at what cost? Characterising Mobile Deep Neural Networks in the wild
Mario Almeida
Stefanos Laskaridis
Abhinav Mehrotra
Łukasz Dudziak
Ilias Leontiadis
Nicholas D. Lane
HAI
161
47
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Wang
Song Han
108
65
0
27 Sep 2021
Deep Structured Instance Graph for Distilling Object Detectors
Yixin Chen
Pengguang Chen
Shu Liu
Liwei Wang
Jiaya Jia
ObjDISeg
69
12
0
27 Sep 2021
Distribution-sensitive Information Retention for Accurate Binary Neural Network
Haotong Qin
Xiangguo Zhang
Ruihao Gong
Yifu Ding
Yi Xu
Xianglong Liu
MQ
73
98
0
25 Sep 2021
Sextans: A Streaming Accelerator for General-Purpose Sparse-Matrix Dense-Matrix Multiplication
Linghao Song
Yuze Chi
Atefeh Sohrabizadeh
Young-kyu Choi
Jason Lau
Jason Cong
GNN
105
62
0
22 Sep 2021
Neural network relief: a pruning algorithm based on neural activity
Aleksandr Dekhovich
David Tax
M. Sluiter
Miguel A. Bessa
120
11
0
22 Sep 2021
Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework
Mohamed Bennai
Alberto Marchisio
Rachmad Vidya Wicaksana Putra
Muhammad Abdullah Hanif
101
34
0
20 Sep 2021
Learning Versatile Convolution Filters for Efficient Visual Recognition
Kai Han
Yunhe Wang
Chang Xu
Chunjing Xu
Enhua Wu
Dacheng Tao
60
8
0
20 Sep 2021
Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching
Tahseen Rabbani
Brandon Yushan Feng
Marco Bornstein
Kyle Rui Sang
Yifan Yang
Arjun Rajkumar
A. Varshney
Furong Huang
FedML
119
2
0
17 Sep 2021
RAPID-RL: A Reconfigurable Architecture with Preemptive-Exits for Efficient Deep-Reinforcement Learning
Adarsh Kosta
Malik Aqeel Anwar
Priyadarshini Panda
A. Raychowdhury
Kaushik Roy
30
4
0
16 Sep 2021
OMPQ: Orthogonal Mixed Precision Quantization
Yuexiao Ma
Taisong Jin
Xiawu Zheng
Yan Wang
Huixia Li
Yongjian Wu
Guannan Jiang
Wei Zhang
Rongrong Ji
MQ
132
38
0
16 Sep 2021
Dense Pruning of Pointwise Convolutions in the Frequency Domain
Mark Buckler
Neil Adit
Yuwei Hu
Zhiru Zhang
Adrian Sampson
3DPC
53
2
0
16 Sep 2021
Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems
Yinghan Long
I. Chakraborty
G. Srinivasan
Kaushik Roy
61
15
0
14 Sep 2021
AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance
Xiangcheng Liu
Jian Cao
Hongyi Yao
Wenyu Sun
Yuan Zhang
68
2
0
14 Sep 2021
KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation
Marzieh S. Tahaei
Ella Charlaix
V. Nia
A. Ghodsi
Mehdi Rezagholizadeh
112
22
0
13 Sep 2021
On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
F. Oliveira
E. Batista
R. Seara
75
10
0
10 Sep 2021
SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning
Febin P. Sunny
Mahdi Nikdast
S. Pasricha
80
22
0
09 Sep 2021
ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
Daniel Becking
Maximilian Dreyer
Wojciech Samek
Karsten Müller
Sebastian Lapuschkin
MQ
358
16
0
09 Sep 2021
SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge Devices
Chulhong Min
Akhil Mathur
Utku Günay Acer
A. Montanari
F. Kawsar
69
12
0
08 Sep 2021
Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks
Cheng Gong
Ye Lu
Kunpeng Xie
Zongming Jin
Tao Li
Yanzhi Wang
MQ
66
7
0
08 Sep 2021
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables
B. Prabakaran
Asima Akhtar
Semeen Rehman
Osman Hasan
Mohamed Bennai
33
10
0
07 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization
Yi Guo
Huan Yuan
Jianchao Tan
Zhangyang Wang
Sen Yang
Ji Liu
94
46
0
06 Sep 2021
On the Accuracy of Analog Neural Network Inference Accelerators
T. Xiao
Ben Feinberg
C. Bennett
V. Prabhakar
Prashant Saxena
V. Agrawal
S. Agarwal
M. Marinella
69
41
0
03 Sep 2021
Architecture Aware Latency Constrained Sparse Neural Networks
Tianli Zhao
Qinghao Hu
Xiangyu He
Weixiang Xu
Jiaxing Wang
Cong Leng
Jian Cheng
76
0
0
01 Sep 2021
Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations
Ido Ben-Yair
Gil Ben Shalom
Moshe Eliasof
Eran Treister
MQ
93
5
0
31 Aug 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu
Dingheng Wang
Xiaotong Lu
Fan Yang
Guoqi Li
W. Dong
Jianbo Shi
113
18
0
30 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
81
92
0
30 Aug 2021
Communication-Computation Efficient Device-Edge Co-Inference via AutoML
Xinjie Zhang
Jiawei Shao
Yuyi Mao
Jun Zhang
66
8
0
30 Aug 2021