1510.00149
Cited By
v1
v2
v3
v4
v5 (latest)
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
Efficient Deep Learning Using Non-Volatile Memory Technology
A. Inci
Mehmet Meric Isgenc
Diana Marculescu
108
3
0
27 Jun 2022
CTMQ: Cyclic Training of Convolutional Neural Networks with Multiple Quantization Steps
Hyunjin Kim
Jungwoon Shin
Alberto A. Del Barrio
MQ
63
2
0
26 Jun 2022
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification
Jun-Teng Yang
Sheng-Che Kao
S. Huang
60
0
0
26 Jun 2022
Training Your Sparse Neural Network Better with Any Mask
Ajay Jaiswal
Haoyu Ma
Tianlong Chen
Ying Ding
Zhangyang Wang
CVBM
137
36
0
26 Jun 2022
p-Meta: Towards On-device Deep Model Adaptation
Zhongnan Qu
Zimu Zhou
Yongxin Tong
Lothar Thiele
75
13
0
25 Jun 2022
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Qingru Zhang
Simiao Zuo
Chen Liang
Alexander Bukharin
Pengcheng He
Weizhu Chen
T. Zhao
83
81
0
25 Jun 2022
Computational Complexity Evaluation of Neural Network Applications in Signal Processing
Pedro J. Freire
S. Srivallapanondh
A. Napoli
Jaroslaw E. Prilepsky
S. Turitsyn
94
1
0
24 Jun 2022
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
Gunho Park
Baeseong Park
Minsub Kim
Sungjae Lee
Jeonghoon Kim
Beomseok Kwon
S. Kwon
Byeongwook Kim
Youngjoo Lee
Dongsoo Lee
MQ
111
85
0
20 Jun 2022
Augmented Imagefication: A Data-driven Fault Detection Method for Aircraft Air Data Sensors
Hang Zhao
Jinyi Ma
Zhongzhi Li
Yiqun Dong
J. Ai
117
0
0
18 Jun 2022
Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes
Matteo Risso
Luca Bompani
Luca Benini
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
MQ
68
12
0
17 Jun 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He
Zeke Xie
Quanzhi Zhu
Zengchang Qin
146
28
0
17 Jun 2022
PRANC: Pseudo RAndom Networks for Compacting deep models
Parsa Nooralinejad
Ali Abbasi
Soroush Abbasi Koohpayegani
Kossar Pourahmadi Meibodi
Rana Muhammad Shahroz Khan
Soheil Kolouri
Hamed Pirsiavash
DD
104
0
0
16 Jun 2022
Asymptotic Soft Cluster Pruning for Deep Neural Networks
Tao Niu
Yinglei Teng
Panpan Zou
28
2
0
16 Jun 2022
"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches
Zhimin Li
Shusen Liu
Xin Yu
Kailkhura Bhavya
Jie Cao
Diffenderfer James Daniel
P. Bremer
Valerio Pascucci
AAML
98
1
0
16 Jun 2022
Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness
Tianlong Chen
Huan Zhang
Zhenyu Zhang
Shiyu Chang
Sijia Liu
Pin-Yu Chen
Zhangyang Wang
AAML
63
11
0
15 Jun 2022
Structured Sparsity Learning for Efficient Video Super-Resolution
Bin Xia
Jingwen He
Yulun Zhang
Yitong Wang
Yapeng Tian
Wenming Yang
Luc Van Gool
57
21
0
15 Jun 2022
QONNX: Representing Arbitrary-Precision Quantized Neural Networks
Alessandro Pappalardo
Yaman Umuroglu
Michaela Blott
Jovan Mitrevski
B. Hawks
...
J. Muhizi
Matthew Trahms
Shih-Chieh Hsu
Scott Hauck
Javier Mauricio Duarte
MQ
41
18
0
15 Jun 2022
Hardening DNNs against Transfer Attacks during Network Compression using Greedy Adversarial Pruning
Jonah O'Brien Weiss
Tiago A. O. Alves
S. Kundu
AAML
33
0
0
15 Jun 2022
Energy Consumption Analysis of pruned Semantic Segmentation Networks on an Embedded GPU
Hugo Tessier
Vincent Gripon
Mathieu Léonardon
M. Arzel
David Bertrand
T. Hannagan
GNN
SSeg
3DPC
70
2
0
13 Jun 2022
Leveraging Structured Pruning of Convolutional Neural Networks
Hugo Tessier
Vincent Gripon
Mathieu Léonardon
M. Arzel
David Bertrand
T. Hannagan
CVBM
63
1
0
13 Jun 2022
A Directed-Evolution Method for Sparsification and Compression of Neural Networks with Application to Object Identification and Segmentation and considerations of optimal quantization using small number of bits
L. Franca-Neto
28
0
0
12 Jun 2022
A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation
Wenjing Yang
G. Wang
Jie Ding
Yuhong Yang
MLT
71
7
0
11 Jun 2022
Data-Efficient Double-Win Lottery Tickets from Robust Pre-training
Tianlong Chen
Zhenyu Zhang
Sijia Liu
Yang Zhang
Shiyu Chang
Zhangyang Wang
AAML
79
8
0
09 Jun 2022
DiSparse: Disentangled Sparsification for Multitask Model Compression
Xing Sun
Ali Hassani
Zhangyang Wang
Gao Huang
Humphrey Shi
137
21
0
09 Jun 2022
Swan: A Neural Engine for Efficient DNN Training on Smartphone SoCs
Sanjay Sri Vallabh Singapuram
Fan Lai
Chuheng Hu
Mosharaf Chowdhury
74
5
0
09 Jun 2022
Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning
Ziqi Zhou
Li Lian
Yilong Yin
Ze Wang
41
1
0
07 Jun 2022
Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good
Jia-Huei Lin
Hannah Sieg
Mikey Ferguson
Xin Yu
Shandian Zhe
J. Wieczorek
Thiago Serra
108
11
0
07 Jun 2022
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Xiaofan Zhang
Yao Chen
Cong Hao
Sitao Huang
Yuhong Li
Deming Chen
84
1
0
06 Jun 2022
GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm
Yanfei Li
Tong Geng
S. Stein
Ang Li
Hui-Ling Yu
MQ
80
8
0
05 Jun 2022
Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning
Arnav Chavan
Rishabh Tiwari
Udbhav Bamba
D. K. Gupta
81
5
0
03 Jun 2022
Canonical convolutional neural networks
Lokesh Veeramacheneni
Moritz Wolter
Reinhard Klein
Jochen Garcke
62
4
0
03 Jun 2022
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang
Binhang Yuan
Luka Rimanic
Yongjun He
Tri Dao
Beidi Chen
Christopher Ré
Ce Zhang
AI4CE
144
13
0
02 Jun 2022
DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
Y. Fu
Haichuan Yang
Jiayi Yuan
Meng Li
Cheng Wan
Raghuraman Krishnamoorthi
Vikas Chandra
Yingyan Lin
149
19
0
02 Jun 2022
Distributed Training for Deep Learning Models On An Edge Computing Network Using Shielded Reinforcement Learning
Tanmoy Sen
Haiying Shen
OffRL
84
5
0
01 Jun 2022
Rotate the ReLU to implicitly sparsify deep networks
Nancy Nayak
Sheetal Kalyani
29
0
0
01 Jun 2022
Bayesian Learning to Discover Mathematical Operations in Governing Equations of Dynamic Systems
Hongpeng Zhou
W. Pan
39
4
0
01 Jun 2022
ORC: Network Group-based Knowledge Distillation using Online Role Change
Jun-woo Choi
Hyeon Cho
Seockhwa Jeong
Wonjun Hwang
48
3
0
01 Jun 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
Decebal Constantin Mocanu
Shiwei Liu
128
8
0
30 May 2022
STN: Scalable Tensorizing Networks via Structure-Aware Training and Adaptive Compression
Chang Nie
Haiquan Wang
Lu Zhao
44
0
0
30 May 2022
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Chen Li
Antonios Tsourdos
Weisi Guo
AI4CE
61
3
0
30 May 2022
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Y. Tan
Pihe Hu
L. Pan
Jiatai Huang
Longbo Huang
OffRL
82
25
0
30 May 2022
Deep Learning Methods for Fingerprint-Based Indoor Positioning: A Review
Fahad Al-homayani
Mohammad H. Mahoor
89
66
0
30 May 2022
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Han Cai
Junyan Li
Muyan Hu
Chuang Gan
Song Han
107
58
0
29 May 2022
Machine Learning for Microcontroller-Class Hardware: A Review
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
111
125
0
29 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
457
2,299
0
27 May 2022
Can Foundation Models Help Us Achieve Perfect Secrecy?
Simran Arora
Christopher Ré
FedML
92
8
0
27 May 2022
A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps
Xiao-Yan Jiang
V. Cambareri
Gianluca Agresti
C. Ugwu
Adriano Simonetto
Fabien Cardinaux
Pietro Zanuttigh
3DV
MQ
77
9
0
25 May 2022
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
116
20
0
25 May 2022
Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free
Tianlong Chen
Zhenyu Zhang
Yihua Zhang
Shiyu Chang
Sijia Liu
Zhangyang Wang
AAML
80
25
0
24 May 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Erik Cambria
M. Aly
Jie Lin
MQ
108
62
0
23 May 2022