Papers
Communities
Organizations
Events
Blog
Pricing
Search
Open menu
Home
Papers
1510.00149
Cited By
v1
v2
v3
v4
v5 (latest)
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
Structural Knowledge Distillation for Object Detection
Philip de Rijk
Lukas Schneider
Marius Cordts
D. Gavrila
80
25
0
23 Nov 2022
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo
Joseph Bethge
Christoph Meinel
Haojin Yang
MQ
121
20
0
23 Nov 2022
Developmental Plasticity-inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks
Bing Han
Feifei Zhao
Yi Zeng
Guobin Shen
93
6
0
23 Nov 2022
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training
Quan Nguyen
Hieu H. Pham
Kok-Seng Wong
Phi Le Nguyen
Truong Thao Nguyen
Minh N. Do
FedML
112
8
0
20 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
277
847
0
18 Nov 2022
Structured Pruning Adapters
Lukas Hedegaard
Aman Alok
Juby Jose
Alexandros Iosifidis
77
11
0
17 Nov 2022
Is Smaller Always Faster? Tradeoffs in Compressing Self-Supervised Speech Transformers
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
84
6
0
17 Nov 2022
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang
Yukang Shi
Hung-Shuo Tai
Zhipeng Zhang
Yuan He
Ke Wang
Kaisheng Ma
80
2
0
14 Nov 2022
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
59
1
0
14 Nov 2022
Robust Training of Graph Neural Networks via Noise Governance
Siyi Qian
Haochao Ying
Renjun Hu
Jingbo Zhou
Jintai Chen
Benlin Liu
Jian Wu
NoLa
116
37
0
12 Nov 2022
Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition
Yashesh Gaur
Nick Kibre
Jian Xue
Kangyuan Shu
Yuhui Wang
Issac Alphonso
Jinyu Li
Jiawei Liu
34
7
0
07 Nov 2022
RUBICON: A Framework for Designing Efficient Deep Learning-Based Genomic Basecallers
Gagandeep Singh
M. Alser
K. Denolf
Can Firtina
Alireza Khodamoradi
Meryem Banu Cavlak
Henk Corporaal
O. Mutlu
65
13
0
06 Nov 2022
Multi-Objective Evolutionary for Object Detection Mobile Architectures Search
Haichao Zhang
Jiashi Li
Xin Xia
K. Hao
Xuefeng Xiao
86
2
0
05 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
93
14
0
04 Nov 2022
Soft Masking for Cost-Constrained Channel Pruning
Ryan Humble
Maying Shen
J. Latorre
Eric Darve1
J. Álvarez
59
14
0
04 Nov 2022
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Zhekai Zhang
Ji Lin
Chenlin Meng
Stefano Ermon
Song Han
Jun-Yan Zhu
DiffM
158
49
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
168
9
0
02 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
80
3
0
31 Oct 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
62
2
0
31 Oct 2022
LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning
Jenny Yang
Jaeuk Kim
Joo-Young Kim
61
2
0
29 Oct 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
69
3
0
28 Oct 2022
Class Based Thresholding in Early Exit Semantic Segmentation Networks
Alperen Görmez
Erdem Koyuncu
79
5
0
27 Oct 2022
Efficient ECG-based Atrial Fibrillation Detection via Parameterised Hypercomplex Neural Networks
Leonie Basso
Zhao Ren
Wolfgang Nejdl
134
2
0
27 Oct 2022
Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training
Mathias Parger
Alexander Ertl
Paul Eibensteiner
J. H. Mueller
Martin Winter
M. Steinberger
72
0
0
25 Oct 2022
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Tian Jin
Michael Carbin
Daniel M. Roy
Jonathan Frankle
Gintare Karolina Dziugaite
84
30
0
25 Oct 2022
Pushing the Efficiency Limit Using Structured Sparse Convolutions
Vinay Kumar Verma
Nikhil Mehta
Shijing Si
Ricardo Henao
Lawrence Carin
52
3
0
23 Oct 2022
Towards Global Neural Network Abstractions with Locally-Exact Reconstruction
Edoardo Manino
I. Bessa
Lucas C. Cordeiro
76
1
0
21 Oct 2022
When Expressivity Meets Trainability: Fewer than
n
n
n
Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
131
10
0
21 Oct 2022
Learning Robust Dynamics through Variational Sparse Gating
A. Jain
Shivakanth Sujit
S. Joshi
Vincent Michalski
Danijar Hafner
Samira Ebrahimi Kahou
73
9
0
21 Oct 2022
Pruning by Active Attention Manipulation
Z. Babaiee
Lucas Liebenwein
Ramin Hasani
Daniela Rus
Radu Grosu
79
0
0
20 Oct 2022
Attaining Class-level Forgetting in Pretrained Model using Few Samples
Pravendra Singh
Pratik Mazumder
M. A. Karim
VLM
CLL
MU
47
1
0
19 Oct 2022
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Muralidhar Andoorveedu
Zhanda Zhu
Bojian Zheng
Gennady Pekhimenko
51
7
0
19 Oct 2022
Packed-Ensembles for Efficient Uncertainty Estimation
Olivier Laurent
Adrien Lafage
Enzo Tartaglione
Geoffrey Daniel
Jean-Marc Martinez
Andrei Bursuc
Gianni Franchi
OODD
148
32
0
17 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
Theo W. Costain
V. Prisacariu
71
0
0
17 Oct 2022
HQNAS: Auto CNN deployment framework for joint quantization and architecture search
Hongjiang Chen
Yang Wang
Leibo Liu
Shaojun Wei
Shouyi Yin
MQ
39
2
0
16 Oct 2022
The Effects of Partitioning Strategies on Energy Consumption in Distributed CNN Inference at The Edge
Erqian Tang
Xiaotian Guo
T. Stefanov
45
1
0
15 Oct 2022
Deep Differentiable Logic Gate Networks
Felix Petersen
Christian Borgelt
Hilde Kuehne
Oliver Deussen
AI4CE
65
30
0
15 Oct 2022
Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
Cecilia Latotzke
Batuhan Balim
T. Gemmeke
MQ
23
2
0
14 Oct 2022
CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models
Denis Kuznedelev
Eldar Kurtic
Elias Frantar
Dan Alistarh
VLM
ViT
88
13
0
14 Oct 2022
Parameter-Efficient Masking Networks
Yue Bai
Huan Wang
Xu Ma
Yitian Zhang
Zhiqiang Tao
Yun Fu
69
10
0
13 Oct 2022
Structural Pruning via Latency-Saliency Knapsack
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
102
50
0
13 Oct 2022
SeKron: A Decomposition Method Supporting Many Factorization Structures
Marawan Gamal Abdel Hameed
A. Mosleh
Marzieh S. Tahaei
V. Nia
68
1
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
53
19
0
11 Oct 2022
Edge-Cloud Cooperation for DNN Inference via Reinforcement Learning and Supervised Learning
Tinghao Zhang
Zhijun Li
Yongrui Chen
Kwok-Yan Lam
Jun Zhao
52
4
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
128
72
0
11 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
61
2
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
110
22
0
10 Oct 2022
Advancing Model Pruning via Bi-level Optimization
Yihua Zhang
Yuguang Yao
Parikshit Ram
Pu Zhao
Tianlong Chen
Min-Fong Hong
Yanzhi Wang
Sijia Liu
173
68
0
08 Oct 2022
Demand Layering for Real-Time DNN Inference with Minimized Memory Usage
Min-Zhi Ji
Saehanseul Yi
Chang-Mo Koo
Sol Ahn
Dongjoo Seo
N. Dutt
Jong-Chan Kim
116
17
0
08 Oct 2022
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
Kaibin Huang
Hai Wu
Zhiyan Liu
Xiaojuan Qi
72
10
0
07 Oct 2022
Previous
1
2
3
...
18
19
20
...
68
69
70
Next