Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015

Song Han

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown

Title
Deep CNNs for Peripheral Blood Cell Classification Ekta Gavas Kaustubh Olpadkar 49 9 0 18 Oct 2021
BERMo: What can BERT learn from ELMo? Sangamesh Kodge Kaushik Roy 69 3 0 18 Oct 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention Zhe Zhou Junling Liu Zhenyu Gu Guangyu Sun 153 45 0 18 Oct 2021
A Dimensionality Reduction Approach for Convolutional Neural Networks L. Meneghetti N. Demo G. Rozza 182 14 0 18 Oct 2021
Finding Everything within Random Binary Networks Kartik K. Sreenivasan Shashank Rajput Jy-yong Sohn Dimitris Papailiopoulos 41 10 0 18 Oct 2021
Network Augmentation for Tiny Deep Learning Han Cai Chuang Gan Ji Lin Song Han 133 30 0 17 Oct 2021
Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations Yu-Shan Tai Chieh-Fang Teng Cheng-Yang Chang A. Wu 43 8 0 17 Oct 2021
S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks Shiyu Liu Chong Min John Tan Mehul Motani CLL 68 4 0 17 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations Xinyu Zhang Ian Colbert Ken Kreutz-Delgado Srinjoy Das MQ 109 12 0 15 Oct 2021
Kronecker Decomposition for GPT Compression Ali Edalati Marzieh S. Tahaei Ahmad Rashid V. Nia J. Clark Mehdi Rezagholizadeh 102 36 0 15 Oct 2021
Joint Channel and Weight Pruning for Model Acceleration on Moblie Devices Tianli Zhao Xi Sheryl Zhang Wentao Zhu Jiaxing Wang Sen Yang Ji Liu Jian Cheng 92 2 0 15 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization Weihan Chen Peisong Wang Jian Cheng MQ 95 69 0 13 Oct 2021
Improving Binary Neural Networks through Fully Utilizing Latent Weights Weixiang Xu Qiang Chen Xiangyu He Peisong Wang Jian Cheng MQ 74 6 0 12 Oct 2021
Mining the Weights Knowledge for Optimizing Neural Network Structures Mengqiao Han Xiabi Liu Zhaoyang Hai Xin Duan 27 1 0 11 Oct 2021
Visualizing the embedding space to explain the effect of knowledge distillation Hyun Seung Lee C. Wallraven 79 1 0 09 Oct 2021
Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis Fahman Saeed M. Hussain Hatim Aboalsamh Fadwa Al Adel A. Owaifeer 114 6 0 08 Oct 2021
GNN is a Counter? Revisiting GNN for Question Answering Kuan-Chieh Wang Yuyu Zhang Diyi Yang Le Song Tao Qin LMTD 79 31 0 07 Oct 2021
Random matrices in service of ML footprint: ternary random features with no performance loss Hafiz Tiomoko Ali Zhenyu Liao Romain Couillet 87 7 0 05 Oct 2021
Progressive Transmission and Inference of Deep Learning Models Youngsoo Lee Sangdoo Yun Yeonghun Kim Sunghee Choi 54 2 0 03 Oct 2021
One Timestep is All You Need: Training Spiking Neural Networks with Ultra Low Latency Sayeed Shafayet Chowdhury Nitin Rathi Kaushik Roy 87 41 0 01 Oct 2021
Prune Your Model Before Distill It Jinhyuk Park Albert No VLM 132 28 0 30 Sep 2021
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition Marawan Gamal Abdel Hameed Marzieh S. Tahaei A. Mosleh V. Nia 90 26 0 29 Sep 2021
Smart at what cost? Characterising Mobile Deep Neural Networks in the wild Mario Almeida Stefanos Laskaridis Abhinav Mehrotra Łukasz Dudziak Ilias Leontiadis Nicholas D. Lane HAI 161 47 0 28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device Ji Lin Chuang Gan Kuan-Chieh Wang Song Han 108 65 0 27 Sep 2021
Deep Structured Instance Graph for Distilling Object Detectors Yixin Chen Pengguang Chen Shu Liu Liwei Wang Jiaya Jia ObjD ISeg 69 12 0 27 Sep 2021
Distribution-sensitive Information Retention for Accurate Binary Neural Network Haotong Qin Xiangguo Zhang Ruihao Gong Yifu Ding Yi Xu Xianglong Liu MQ 73 98 0 25 Sep 2021
Sextans: A Streaming Accelerator for General-Purpose Sparse-Matrix Dense-Matrix Multiplication Linghao Song Yuze Chi Atefeh Sohrabizadeh Young-kyu Choi Jason Lau Jason Cong GNN 105 62 0 22 Sep 2021
Neural network relief: a pruning algorithm based on neural activity Aleksandr Dekhovich David Tax M. Sluiter Miguel A. Bessa 120 11 0 22 Sep 2021
Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework Mohamed Bennai Alberto Marchisio Rachmad Vidya Wicaksana Putra Muhammad Abdullah Hanif 101 34 0 20 Sep 2021
Learning Versatile Convolution Filters for Efficient Visual Recognition Kai Han Yunhe Wang Chang Xu Chunjing Xu Enhua Wu Dacheng Tao 60 8 0 20 Sep 2021
Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching Tahseen Rabbani Brandon Yushan Feng Marco Bornstein Kyle Rui Sang Yifan Yang Arjun Rajkumar A. Varshney Furong Huang FedML 119 2 0 17 Sep 2021
RAPID-RL: A Reconfigurable Architecture with Preemptive-Exits for Efficient Deep-Reinforcement Learning Adarsh Kosta Malik Aqeel Anwar Priyadarshini Panda A. Raychowdhury Kaushik Roy 30 4 0 16 Sep 2021
OMPQ: Orthogonal Mixed Precision Quantization Yuexiao Ma Taisong Jin Xiawu Zheng Yan Wang Huixia Li Yongjian Wu Guannan Jiang Wei Zhang Rongrong Ji MQ 132 38 0 16 Sep 2021
Dense Pruning of Pointwise Convolutions in the Frequency Domain Mark Buckler Neil Adit Yuwei Hu Zhiru Zhang Adrian Sampson 3DPC 53 2 0 16 Sep 2021
Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems Yinghan Long I. Chakraborty G. Srinivasan Kaushik Roy 61 15 0 14 Sep 2021
AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance Xiangcheng Liu Jian Cao Hongyi Yao Wenyu Sun Yuan Zhang 68 2 0 14 Sep 2021
KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation Marzieh S. Tahaei Ella Charlaix V. Nia A. Ghodsi Mehdi Rezagholizadeh 112 22 0 13 Sep 2021
On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning F. Oliveira E. Batista R. Seara 75 10 0 10 Sep 2021
SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning Febin P. Sunny Mahdi Nikdast S. Pasricha 80 22 0 09 Sep 2021
ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs Daniel Becking Maximilian Dreyer Wojciech Samek Karsten Müller Sebastian Lapuschkin MQ 358 16 0 09 Sep 2021
SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge Devices Chulhong Min Akhil Mathur Utku Günay Acer A. Montanari F. Kawsar 69 12 0 08 Sep 2021
Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks Cheng Gong Ye Lu Kunpeng Xie Zongming Jin Tao Li Yanzhi Wang MQ 66 7 0 08 Sep 2021
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables B. Prabakaran Asima Akhtar Semeen Rehman Osman Hasan Mohamed Bennai 33 10 0 07 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization Yi Guo Huan Yuan Jianchao Tan Zhangyang Wang Sen Yang Ji Liu 94 46 0 06 Sep 2021
On the Accuracy of Analog Neural Network Inference Accelerators T. Xiao Ben Feinberg C. Bennett V. Prabhakar Prashant Saxena V. Agrawal S. Agarwal M. Marinella 69 41 0 03 Sep 2021
Architecture Aware Latency Constrained Sparse Neural Networks Tianli Zhao Qinghao Hu Xiangyu He Weixiang Xu Jiaxing Wang Cong Leng Jian Cheng 76 0 0 01 Sep 2021
Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations Ido Ben-Yair Gil Ben Shalom Moshe Eliasof Eran Treister MQ 93 5 0 31 Aug 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions Yang Wu Dingheng Wang Xiaotong Lu Fan Yang Guoqi Li W. Dong Jianbo Shi 113 18 0 30 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI Amin Banitalebi-Dehkordi Naveen Vedula J. Pei Fei Xia Lanjun Wang Yong Zhang 81 92 0 30 Aug 2021
Communication-Computation Efficient Device-Edge Co-Inference via AutoML Xinjie Zhang Jiawei Shao Yuyi Mao Jun Zhang 66 8 0 30 Aug 2021