Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015

Song Han

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown

Title
Automatic Rank Selection for High-Speed Convolutional Neural Network Hyeji Kim C. Kyung 61 5 0 28 Jun 2018
DeepObfuscation: Securing the Structure of Convolutional Neural Networks via Knowledge Distillation Hui Xu Yuxin Su Zirui Zhao Yangfan Zhou Michael R. Lyu Irwin King FedML 67 27 0 27 Jun 2018
Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions Junru Wu Yue Wang Zhenyu Wu Zhangyang Wang Ashok Veeraraghavan Yingyan Lin 90 115 0 24 Jun 2018
Constructing Deep Neural Networks by Bayesian Network Structure Learning R. Y. Rohekar Shami Nisimov Yaniv Gurwicz G. Koren Gal Novik BDL 151 26 0 24 Jun 2018
Compact Deep Neural Networks for Computationally Efficient Gesture Classification From Electromyography Signals A. Hartwell V. Kadirkamanathan S. Anderson 13 17 0 22 Jun 2018
Deploying Deep Neural Networks in the Embedded Space Stylianos I. Venieris Alexandros Kouris C. Bouganis 80 13 0 22 Jun 2018
Efficient Semantic Segmentation using Gradual Grouping Nikitha Vallurupalli Sriharsha Annamaneni G. Varma C. V. Jawahar Manu Mathew S. Nagori SSeg 76 12 0 22 Jun 2018
Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations Ting-Li Chen Martin Renqiang Min Yizhou Sun 77 71 0 21 Jun 2018
Quantizing deep convolutional networks for efficient inference: A whitepaper Raghuraman Krishnamoorthi MQ 145 1,026 0 21 Jun 2018
Rethinking Machine Learning Development and Deployment for Edge Devices Liangzhen Lai Naveen Suda 53 10 0 20 Jun 2018
Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy En Li Zhi Zhou Xu Chen 89 332 0 20 Jun 2018
Doubly Nested Network for Resource-Efficient Inference Jaehong Kim Sungeun Hong Yongseok Choi Jiwon Kim 45 5 0 20 Jun 2018
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? Shilin Zhu Xin Dong Hao Su MQ 115 138 0 20 Jun 2018
GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking Patrick H. Chen Si Si Yang Li Ciprian Chelba Cho-Jui Hsieh 69 70 0 18 Jun 2018
Fast Convex Pruning of Deep Neural Networks Alireza Aghasi Afshin Abdi Justin Romberg 73 24 0 17 Jun 2018
On Machine Learning and Structure for Mobile Robots Markus Wulfmeier 51 6 0 15 Jun 2018
Three dimensional Deep Learning approach for remote sensing image classification A. Ben Hamida A. Benoît P. Lambert C. Ben Amar 105 588 0 15 Jun 2018
RAPIDNN: In-Memory Deep Neural Network Acceleration Framework Mohsen Imani Mohammad Samragh Yeseong Kim Saransh Gupta F. Koushanfar Tajana Simunic 85 51 0 15 Jun 2018
Deep Learning Approximation: Zero-Shot Neural Network Speedup Michele Pratusevich 39 0 0 15 Jun 2018
Insights on representational similarity in neural networks with canonical correlation Ari S. Morcos M. Raghu Samy Bengio DRL 149 447 0 14 Jun 2018
PCAS: Pruning Channels with Attention Statistics for Deep Network Compression Kohei Yamamoto K. Maeno 68 33 0 14 Jun 2018
Scalable Neural Network Compression and Pruning Using Hard Clustering and L1 Regularization Yibo Yang Nicholas Ruozzi Vibhav Gogate 33 2 0 14 Jun 2018
The streaming rollout of deep networks - towards fully model-parallel execution Volker Fischer Jan M. Köhler Thomas Pfeil 71 16 0 13 Jun 2018
Knowledge Distillation by On-the-Fly Native Ensemble Xu Lan Xiatian Zhu S. Gong 306 483 0 12 Jun 2018
Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking Haichuan Yang Yuhao Zhu Ji Liu CVBM 116 36 0 12 Jun 2018
Full deep neural network training on a pruned weight budget Maximilian Golub G. Lemieux Mieszko Lis 94 28 0 11 Jun 2018
Smallify: Learning Network Size while Training Guillaume Leclerc Manasi Vartak Raul Castro Fernandez Tim Kraska Samuel Madden 61 13 0 10 Jun 2018
TAPAS: Tricks to Accelerate (encrypted) Prediction As a Service Amartya Sanyal Matt J. Kusner Adria Gascon Varun Kanade FedML 74 127 0 09 Jun 2018
Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware Florian Tramèr Dan Boneh FedML 197 404 0 08 Jun 2018
EasyConvPooling: Random Pooling with Easy Convolution for Accelerating Training and Testing Jianzhong Sheng Chuanbo Chen Chenchen Fu Chun Jason Xue 87 5 0 05 Jun 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark Cody Coleman Daniel Kang Deepak Narayanan Luigi Nardi Tian Zhao Jian Zhang Peter Bailis K. Olukotun Christopher Ré Matei A. Zaharia 73 117 0 04 Jun 2018
Dynamically Hierarchy Revolution: DirNet for Compressing Recurrent Neural Network on Mobile Devices Jie Zhang Xiaolong Wang Dawei Li Yalin Wang 83 14 0 04 Jun 2018
Targeted Kernel Networks: Faster Convolutions with Attentive Regularization Kashyap Chitta 30 2 0 01 Jun 2018
IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks Ke Sun Mingjie Li Dong Liu Jingdong Wang 138 126 0 01 Jun 2018
A Highly Parallel FPGA Implementation of Sparse Neural Network Training Sourya Dey Diandian Chen Zongyang Li Souvik Kundu Kuan-Wen Huang K. Chugg Peter A. Beerel 75 11 0 31 May 2018
Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks Kang Liu Brendan Dolan-Gavitt S. Garg AAML 104 1,055 0 30 May 2018
Channel Gating Neural Networks Weizhe Hua Yuan Zhou Christopher De Sa Zhiru Zhang G. E. Suh 70 180 0 29 May 2018
A novel channel pruning method for deep neural network compression Yiming Hu Siyang Sun Jianquan Li Xingang Wang Qingyi Gu 75 67 0 29 May 2018
Retraining-Based Iterative Weight Quantization for Deep Neural Networks Dongsoo Lee Byeongwook Kim MQ 84 16 0 29 May 2018
Adaptive Network Sparsification with Dependent Variational Beta-Bernoulli Dropout Juho Lee Saehoon Kim Jaehong Yoon Haebeom Lee Eunho Yang Sung Ju Hwang 55 12 0 28 May 2018
Constructing Fast Network through Deconstruction of Convolution Yunho Jeon Junmo Kim 113 72 0 28 May 2018
Compact and Computationally Efficient Representation of Deep Neural Networks Simon Wiedemann K. Müller Wojciech Samek MQ 96 71 0 27 May 2018
Accelerating CNN inference on FPGAs: A Survey K. Abdelouahab Maxime Pelcat Jocelyn Serot F. Berry AI4CE 53 149 0 26 May 2018
Heterogeneous Bitwidth Binarization in Convolutional Neural Networks Josh Fromm Shwetak N. Patel Matthai Philipose MQ 87 27 0 25 May 2018
Tensorial Neural Networks: Generalization of Neural Networks and Application to Model Compression Jiahao Su Jingling Li Bobby Bhattacharjee Furong Huang 82 20 0 25 May 2018
Scalable Methods for 8-bit Training of Neural Networks Ron Banner Itay Hubara Elad Hoffer Daniel Soudry MQ 94 342 0 25 May 2018
Multi-Task Zipping via Layer-wise Neuron Sharing Xiaoxi He Zimu Zhou Lothar Thiele MoMe 66 64 0 24 May 2018
Learning towards Minimum Hyperspherical Energy Weiyang Liu Rongmei Lin Ziqiang Liu Lixin Liu Zhiding Yu Bo Dai Le Song 165 151 0 23 May 2018
AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference Jian-Hao Luo Jianxin Wu 98 211 0 23 May 2018
Approximate Random Dropout Zhuoran Song Ru Wang Dongyu Ru Hongru Huang Zhenghao Peng Hai Zhao Xiaoyao Liang Li Jiang BDL 44 9 0 23 May 2018