ResearchTrend.AI
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown
Title
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
139
56
0
06 Jul 2023
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Luciano Del Corro
Allison Del Giorno
Sahaj Agarwal
Ting Yu
Ahmed Hassan Awadallah
Subhabrata Mukherjee
135
61
0
05 Jul 2023
Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu
Yichen Zhu
ViT
125
17
0
05 Jul 2023
Why do CNNs excel at feature extraction? A mathematical explanation
V. Nandakumar
Arush Tagade
Tongliang Liu
FAtt
45
0
0
03 Jul 2023
Structured Network Pruning by Measuring Filter-wise Interactions
Wenting Tang
Xingxing Wei
Yue Liu
49
0
0
03 Jul 2023
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Jun Chen
Shipeng Bai
Tianxin Huang
Mengmeng Wang
Guanzhong Tian
Y. Liu
MQ
119
19
0
02 Jul 2023
Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler
Shaohui Lin
Wenxuan Huang
Jiao Xie
Baochang Zhang
Yunhang Shen
Zhou Yu
Jungong Han
David Doermann
64
2
0
01 Jul 2023
Miniaturized Graph Convolutional Networks with Topologically Consistent Pruning
H. Sahbi
63
0
0
30 Jun 2023
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Tianshuo Xu
Xiaoshuai Sun
Tongliang Liu
Rongrong Ji
Dacheng Tao
AAML
75
2
0
30 Jun 2023
OSP: Boosting Distributed Model Training with 2-stage Synchronization
Zixuan Chen
Lei Shi
Xuandong Liu
Jiahui Li
Sen Liu
Yang Xu
113
4
0
29 Jun 2023
DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
Bahareh Khabbazan
Marc Riera
Antonio González
MQ
81
3
0
28 Jun 2023
SparseOptimizer: Sparsify Language Models through Moreau-Yosida Regularization and Accelerate via Compiler Co-design
Fu-Ming Guo
MoE
119
0
0
27 Jun 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
Chong Chen
Yaochu Jin
AIFin, AI4CE
239
99
0
27 Jun 2023
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair
Hongxu Yin
Maying Shen
Pavlo Molchanov
J. Álvarez
116
12
0
25 Jun 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Yizhou Sun
GNN, AI4CE
103
25
0
24 Jun 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
229
315
0
24 Jun 2023
Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare
J. F. Hernandez-Garcia
Parash Rahman
A. Rupam Mahmood
Richard S. Sutton
KELM, CLL
99
30
0
23 Jun 2023
Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window
Jinkyu Koo
John Yang
Le An
Gwenaelle Cunha Sergio
Su Inn Park
ViT
57
0
0
23 Jun 2023
Binary domain generalization for sparsifying binary neural networks
Riccardo Schiavone
Francesco Galati
Maria A. Zuluaga
MQ
79
0
0
23 Jun 2023
Neural Network Pruning for Real-time Polyp Segmentation
Suman Sapkota
Pranav Poudel
Sudarshan Regmi
Bibek Panthi
Binod Bhattarai
MedIm
80
0
0
22 Jun 2023
MultiTASC: A Multi-Tenancy-Aware Scheduler for Cascaded DNN Inference at the Consumer Edge
Sokratis Nikolaidis
Stylianos I. Venieris
59
2
0
22 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
188
443
0
20 Jun 2023
Towards Environmentally Equitable AI via Geographical Load Balancing
Pengfei Li
Jianyi Yang
Adam Wierman
Shaolei Ren
94
12
0
20 Jun 2023
Dynamic Perceiver for Efficient Visual Recognition
Yizeng Han
Dongchen Han
Zeyu Liu
Yulin Wang
Xuran Pan
Yifan Pu
Chaorui Deng
Junlan Feng
S. Song
Gao Huang
108
30
0
20 Jun 2023
AI Clinics on Mobile (AICOM): Universal AI Doctors for the Underserved and Hard-to-Reach
Tim Tianyi Yang
Na An
Ao Kong
Shaoshan Liu
Xue Liu
42
2
0
17 Jun 2023
Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition
Ashish Jha
Dimitrii Ermilov
Konstantin Sobolev
Anh-Huy Phan
S. Ahmadi-Asl
...
Imran N. Junejo
Z. Aghbari
Thar Baker
A. Khedr
A. Cichocki
CVBM
36
1
0
16 Jun 2023
HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation
Ho Man Kwan
Ge Gao
Fan Zhang
Andrew Gower
David Bull
82
54
0
16 Jun 2023
[Experiments & Analysis] Evaluating the Feasibility of Sampling-Based Techniques for Training Multilayer Perceptrons
Sana Ebrahimi
Rishi Advani
Abolfazl Asudeh
136
0
0
15 Jun 2023
Understanding Parameter Sharing in Transformers
Ye Lin
Mingxuan Wang
Zhexi Zhang
Xiaohui Wang
Tong Xiao
Jingbo Zhu
MoE
79
2
0
15 Jun 2023
Neural Network Compression using Binarization and Few Full-Precision Weights
F. M. Nardini
Cosimo Rulli
Salvatore Trani
Rossano Venturini
MQ
88
1
0
15 Jun 2023
High-performance deep spiking neural networks with 0.3 spikes per neuron
A. Stanojević
Stanisław Woźniak
G. Bellec
G. Cherubini
A. Pantazi
W. Gerstner
118
19
0
14 Jun 2023
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim
Coleman Hooper
A. Gholami
Zhen Dong
Xiuyu Li
Sheng Shen
Michael W. Mahoney
Kurt Keutzer
MQ
168
198
0
13 Jun 2023
RAMAN: A Re-configurable and Sparse tinyML Accelerator for Inference on Edge
Adithya Krishna
Srikanth Rohit Nudurupati
Chandana D G
Pritesh Dwivedi
André van Schaik
M. Mehendale
Chetan Singh Thakur
64
16
0
10 Jun 2023
FalconNet: Factorization for the Light-weight ConvNets
Zhicheng Cai
Qiu Shen
125
14
0
10 Jun 2023
MobileNMT: Enabling Translation in 15MB and 30ms
Ye Lin
Xiaohui Wang
Zhexi Zhang
Mingxuan Wang
Tong Xiao
Jingbo Zhu
MQ
68
2
0
07 Jun 2023
CFDP: Common Frequency Domain Pruning
Samir Khaki
Weihan Luo
3DV
89
5
0
07 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
80
34
0
06 Jun 2023
Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability
Jianing Zhu
Hengzhuang Li
Jiangchao Yao
Tongliang Liu
Jianliang Xu
Bo Han
OODD
80
13
0
06 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL, MQ
296
589
0
01 Jun 2023
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
J. H. Lee
Jeonghoon Kim
S. Kwon
Dongsoo Lee
MQ
124
38
0
01 Jun 2023
Accurate and Structured Pruning for Efficient Automatic Speech Recognition
Huiqiang Jiang
Li Zhang
Yuang Li
Yu-Huan Wu
Shijie Cao
Ting Cao
Yuqing Yang
Jinyu Li
Mao Yang
Lili Qiu
CVBM
140
12
0
31 May 2023
Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam
Steven Kolawole
S. Sethi
Nishant Bansali
Karina Nguyen
ViT
74
4
0
30 May 2023
Budget-Aware Graph Convolutional Network Design using Probabilistic Magnitude Pruning
H. Sahbi
64
0
0
30 May 2023
Compact Real-time Radiance Fields with Neural Codebook
Lingzhi Li
Zhongshu Wang
Zhen Shen
Li Shen
Ping Tan
85
6
0
29 May 2023
A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images
M. Campana
Marco Colussi
Franca Delmastro
S. Mascetti
Elena Pagani
58
12
0
29 May 2023
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Dachuan Shi
Chaofan Tao
Anyi Rao
Zhendong Yang
Chun Yuan
Jiaqi Wang
VLM
136
23
0
27 May 2023
Input-Aware Dynamic Timestep Spiking Neural Networks for Efficient In-Memory Computing
Yuhang Li
Abhishek Moitra
Tamar Geller
Priyadarshini Panda
65
21
0
27 May 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM, ViT
134
9
0
26 May 2023
Improving Knowledge Distillation via Regularizing Feature Norm and Direction
Yuzhu Wang
Lechao Cheng
Manni Duan
Yongheng Wang
Zunlei Feng
Shu Kong
97
22
0
26 May 2023
CUEING: a lightweight model to Capture hUman attEntion In driviNG
Linfeng Liang
Yao Deng
Yang Zhang
Jianchao Lu
Chen Wang
Quan Z. Sheng
Xi Zheng
91
2
0
25 May 2023