ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (arXiv:1510.00149)

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown
Efficient Deep Learning Using Non-Volatile Memory Technology
A. Inci
Mehmet Meric Isgenc
Diana Marculescu
108
3
0
27 Jun 2022
CTMQ: Cyclic Training of Convolutional Neural Networks with Multiple Quantization Steps
Hyunjin Kim
Jungwoon Shin
Alberto A. Del Barrio
MQ
63
2
0
26 Jun 2022
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification
Jun-Teng Yang
Sheng-Che Kao
S. Huang
60
0
0
26 Jun 2022
Training Your Sparse Neural Network Better with Any Mask
Ajay Jaiswal
Haoyu Ma
Tianlong Chen
Ying Ding
Zhangyang Wang
CVBM
137
36
0
26 Jun 2022
p-Meta: Towards On-device Deep Model Adaptation
Zhongnan Qu
Zimu Zhou
Yongxin Tong
Lothar Thiele
75
13
0
25 Jun 2022
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Qingru Zhang
Simiao Zuo
Chen Liang
Alexander Bukharin
Pengcheng He
Weizhu Chen
T. Zhao
83
81
0
25 Jun 2022
Computational Complexity Evaluation of Neural Network Applications in Signal Processing
Pedro J. Freire
S. Srivallapanondh
A. Napoli
Jaroslaw E. Prilepsky
S. Turitsyn
94
1
0
24 Jun 2022
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
Gunho Park
Baeseong Park
Minsub Kim
Sungjae Lee
Jeonghoon Kim
Beomseok Kwon
S. Kwon
Byeongwook Kim
Youngjoo Lee
Dongsoo Lee
MQ
111
85
0
20 Jun 2022
Augmented Imagefication: A Data-driven Fault Detection Method for Aircraft Air Data Sensors
Hang Zhao
Jinyi Ma
Zhongzhi Li
Yiqun Dong
J. Ai
117
0
0
18 Jun 2022
Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes
Matteo Risso
Luca Bompani
Luca Benini
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
MQ
68
12
0
17 Jun 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He
Zeke Xie
Quanzhi Zhu
Zengchang Qin
146
28
0
17 Jun 2022
PRANC: Pseudo RAndom Networks for Compacting deep models
Parsa Nooralinejad
Ali Abbasi
Soroush Abbasi Koohpayegani
Kossar Pourahmadi Meibodi
Rana Muhammad Shahroz Khan
Soheil Kolouri
Hamed Pirsiavash
DD
104
0
0
16 Jun 2022
Asymptotic Soft Cluster Pruning for Deep Neural Networks
Tao Niu
Yinglei Teng
Panpan Zou
28
2
0
16 Jun 2022
"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches
Zhimin Li
Shusen Liu
Xin Yu
Kailkhura Bhavya
Jie Cao
Diffenderfer James Daniel
P. Bremer
Valerio Pascucci
AAML
98
1
0
16 Jun 2022
Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness
Tianlong Chen
Huan Zhang
Zhenyu Zhang
Shiyu Chang
Sijia Liu
Pin-Yu Chen
Zhangyang Wang
AAML
63
11
0
15 Jun 2022
Structured Sparsity Learning for Efficient Video Super-Resolution
Bin Xia
Jingwen He
Yulun Zhang
Yitong Wang
Yapeng Tian
Wenming Yang
Luc Van Gool
57
21
0
15 Jun 2022
QONNX: Representing Arbitrary-Precision Quantized Neural Networks
Alessandro Pappalardo
Yaman Umuroglu
Michaela Blott
Jovan Mitrevski
B. Hawks
...
J. Muhizi
Matthew Trahms
Shih-Chieh Hsu
Scott Hauck
Javier Mauricio Duarte
MQ
41
18
0
15 Jun 2022
Hardening DNNs against Transfer Attacks during Network Compression using Greedy Adversarial Pruning
Jonah O'Brien Weiss
Tiago A. O. Alves
S. Kundu
AAML
33
0
0
15 Jun 2022
Energy Consumption Analysis of pruned Semantic Segmentation Networks on an Embedded GPU
Hugo Tessier
Vincent Gripon
Mathieu Léonardon
M. Arzel
David Bertrand
T. Hannagan
GNNSSeg3DPC
70
2
0
13 Jun 2022
Leveraging Structured Pruning of Convolutional Neural Networks
Hugo Tessier
Vincent Gripon
Mathieu Léonardon
M. Arzel
David Bertrand
T. Hannagan
CVBM
63
1
0
13 Jun 2022
A Directed-Evolution Method for Sparsification and Compression of Neural Networks with Application to Object Identification and Segmentation and considerations of optimal quantization using small number of bits
L. Franca-Neto
28
0
0
12 Jun 2022
A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation
Wenjing Yang
G. Wang
Jie Ding
Yuhong Yang
MLT
71
7
0
11 Jun 2022
Data-Efficient Double-Win Lottery Tickets from Robust Pre-training
Tianlong Chen
Zhenyu Zhang
Sijia Liu
Yang Zhang
Shiyu Chang
Zhangyang Wang
AAML
79
8
0
09 Jun 2022
DiSparse: Disentangled Sparsification for Multitask Model Compression
Xing Sun
Ali Hassani
Zhangyang Wang
Gao Huang
Humphrey Shi
137
21
0
09 Jun 2022
Swan: A Neural Engine for Efficient DNN Training on Smartphone SoCs
Sanjay Sri Vallabh Singapuram
Fan Lai
Chuheng Hu
Mosharaf Chowdhury
74
5
0
09 Jun 2022
Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning
Ziqi Zhou
Li Lian
Yilong Yin
Ze Wang
41
1
0
07 Jun 2022
Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good
Jia-Huei Lin
Hannah Sieg
Mikey Ferguson
Xin Yu
Shandian Zhe
J. Wieczorek
Thiago Serra
108
11
0
07 Jun 2022
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Xiaofan Zhang
Yao Chen
Cong Hao
Sitao Huang
Yuhong Li
Deming Chen
84
1
0
06 Jun 2022
GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm
Yanfei Li
Tong Geng
S. Stein
Ang Li
Hui-Ling Yu
MQ
80
8
0
05 Jun 2022
Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning
Arnav Chavan
Rishabh Tiwari
Udbhav Bamba
D. K. Gupta
81
5
0
03 Jun 2022
Canonical convolutional neural networks
Lokesh Veeramacheneni
Moritz Wolter
Reinhard Klein
Jochen Garcke
62
4
0
03 Jun 2022
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang
Binhang Yuan
Luka Rimanic
Yongjun He
Tri Dao
Beidi Chen
Christopher Ré
Ce Zhang
AI4CE
144
13
0
02 Jun 2022
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
Y. Fu
Haichuan Yang
Jiayi Yuan
Meng Li
Cheng Wan
Raghuraman Krishnamoorthi
Vikas Chandra
Yingyan Lin
149
19
0
02 Jun 2022
Distributed Training for Deep Learning Models On An Edge Computing Network Using Shielded Reinforcement Learning
Tanmoy Sen
Haiying Shen
OffRL
84
5
0
01 Jun 2022
Rotate the ReLU to implicitly sparsify deep networks
Nancy Nayak
Sheetal Kalyani
29
0
0
01 Jun 2022
Bayesian Learning to Discover Mathematical Operations in Governing Equations of Dynamic Systems
Hongpeng Zhou
W. Pan
39
4
0
01 Jun 2022
ORC: Network Group-based Knowledge Distillation using Online Role Change
Jun-woo Choi
Hyeon Cho
Seockhwa Jeong
Wonjun Hwang
48
3
0
01 Jun 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
Decebal Constantin Mocanu
Shiwei Liu
128
8
0
30 May 2022
STN: Scalable Tensorizing Networks via Structure-Aware Training and Adaptive Compression
Chang Nie
Haiquan Wang
Lu Zhao
44
0
0
30 May 2022
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Chen Li
Antonios Tsourdos
Weisi Guo
AI4CE
61
3
0
30 May 2022
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Y. Tan
Pihe Hu
L. Pan
Jiatai Huang
Longbo Huang
OffRL
82
25
0
30 May 2022
Deep Learning Methods for Fingerprint-Based Indoor Positioning: A Review
Fahad Al-homayani
Mohammad H. Mahoor
89
66
0
30 May 2022
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Han Cai
Junyan Li
Muyan Hu
Chuang Gan
Song Han
107
58
0
29 May 2022
Machine Learning for Microcontroller-Class Hardware: A Review
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
111
125
0
29 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
457
2,299
0
27 May 2022
Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora
Christopher Ré
FedML
92
8
0
27 May 2022
A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps
Xiao-Yan Jiang
V. Cambareri
Gianluca Agresti
C. Ugwu
Adriano Simonetto
Fabien Cardinaux
Pietro Zanuttigh
3DVMQ
77
9
0
25 May 2022
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
116
20
0
25 May 2022
Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free
Tianlong Chen
Zhenyu Zhang
Yihua Zhang
Shiyu Chang
Sijia Liu
Zhangyang Wang
AAML
80
25
0
24 May 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Erik Cambria
M. Aly
Jie Lin
MQ
108
62
0
23 May 2022