EIE: Efficient Inference Engine on Compressed Deep Neural Network

4 February 2016

Song Han

Papers citing "EIE: Efficient Inference Engine on Compressed Deep Neural Network"


Computation on Sparse Neural Networks: an Inspiration for Future Hardware Fei Sun Minghai Qin Tianyun Zhang Liu Liu Yen-kuang Chen Yuan Xie 42 7 0 24 Apr 2020
PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices Chunhua Deng Siyu Liao Yi Xie Keshab K. Parhi Xuehai Qian Bo Yuan 46 93 0 23 Apr 2020
HCM: Hardware-Aware Complexity Metric for Neural Network Architectures Alex Karbachevsky Chaim Baskin Evgenii Zheltonozhskii Yevgeny Yermolin F. Gabbay A. Bronstein A. Mendelson 40 11 0 19 Apr 2020
Bit-Parallel Vector Composability for Neural Acceleration Soroush Ghodrati Hardik Sharma C. Young Nam Sung Kim H. Esmaeilzadeh MQ 14 20 0 11 Apr 2020
Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training Simon Wiedemann Temesgen Mehari Kevin Kepp Wojciech Samek 32 18 0 09 Apr 2020
Reducing Data Motion to Accelerate the Training of Deep Neural Networks Sicong Zhuang Cristiano Malossi Marc Casas 27 0 0 05 Apr 2020
Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets D. Haase Manuel Amthor 20 132 0 30 Mar 2020
Data-Driven Neuromorphic DRAM-based CNN and RNN Accelerators T. Delbruck Shih-Chii Liu 27 4 0 29 Mar 2020
DP-Net: Dynamic Programming Guided Deep Neural Network Compression Dingcheng Yang Wenjian Yu Ao Zhou Haoyuan Mu G. Yao Xiaoyi Wang 21 6 0 21 Mar 2020
Cluster Pruning: An Efficient Filter Pruning Method for Edge AI Vision Applications Chinthaka Gamanayake Lahiru Jayasinghe Benny Kai Kiat Ng Chau Yuen VLM 28 46 0 05 Mar 2020
Comparing Rewinding and Fine-tuning in Neural Network Pruning Alex Renda Jonathan Frankle Michael Carbin 237 383 0 05 Mar 2020
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices Byung Hoon Ahn Jinwon Lee J. Lin Hsin-Pai Cheng Jilei Hou H. Esmaeilzadeh 76 55 0 04 Mar 2020
DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures Yang Zhao Chaojian Li Yue Wang Pengfei Xu Yongan Zhang Yingyan Lin 27 41 0 26 Feb 2020
HRank: Filter Pruning using High-Rank Feature Map Mingbao Lin Rongrong Ji Yan Wang Yichen Zhang Baochang Zhang Yonghong Tian Ling Shao 13 717 0 24 Feb 2020
A$^3$: Accelerating Attention Mechanisms in Neural Networks with Approximation Tae Jun Ham Sungjun Jung Seonghak Kim Young H. Oh Yeonhong Park ... Jung-Hun Park Sanghee Lee Kyoung Park Jae W. Lee D. Jeong 24 214 0 22 Feb 2020
Taurus: A Data Plane Architecture for Per-Packet ML Tushar Swamy Alexander Rucker M. Shahbaz Ishan Gaur K. Olukotun 23 82 0 12 Feb 2020
PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators Zhanhong Tan Jiebo Song Xiaolong Ma S. Tan Hongyang Chen ... Yifu Wu Shaokai Ye Yanzhi Wang Dehui Li Kaisheng Ma 38 24 0 11 Feb 2020
PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning Dor Livne Kobi Cohen 34 50 0 14 Jan 2020
Least squares binary quantization of neural networks Hadi Pouransari Zhucheng Tu Oncel Tuzel MQ 17 32 0 09 Jan 2020
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference Udit Gupta Samuel Hsia V. Saraph Xiaodong Wang Brandon Reagen Gu-Yeon Wei Hsien-Hsin S. Lee David Brooks Carole-Jean Wu GNN 41 188 0 08 Jan 2020
Lightweight Residual Densely Connected Convolutional Neural Network Fahimeh Fooladgar S. Kasaei 24 13 0 02 Jan 2020
2L-3W: 2-Level 3-Way Hardware-Software Co-Verification for the Mapping of Deep Learning Architecture (DLA) onto FPGA Boards Tolulope A. Odetola Katie M. Groves S. R. Hasan 29 5 0 14 Nov 2019
Communication Lower Bound in Convolution Accelerators Xiaoming Chen Yinhe Han Yu Wang 26 29 0 08 Nov 2019
MLPerf Inference Benchmark Vijayarāghava Reḍḍī C. Cheng David Kanter Pete H Mattson Guenther Schmuelling ... Bing Yu George Y. Yuan Aaron Zhong P. Zhang Yuchen Zhou 31 489 0 06 Nov 2019
ALERT: Accurate Learning for Energy and Timeliness Chengcheng Wan M. Santriaji E. Rogers H. Hoffmann Michael Maire Shan Lu AI4CE 48 40 0 31 Oct 2019
Deep Learning at the Edge Sahar Voghoei N. Tonekaboni Jason G. Wallace H. Arabnia 21 41 0 22 Oct 2019
Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks Yihui He Jianing Qian Jianren Wang Cindy X. Le Congrui Hetang Qi Lyu Wenping Wang Tianwei Yue 53 11 0 21 Oct 2019
Deep Semantic Segmentation of Natural and Medical Images: A Review Saeid Asgari Taghanaki Kumar Abhishek Joseph Paul Cohen Julien Cohen-Adad Ghassan Hamarneh SSeg VLM 47 668 0 16 Oct 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM Skanda Koppula Lois Orosa A. G. Yaglikçi Roknoddin Azizi Taha Shahroodi Konstantinos Kanellopoulos O. Mutlu 32 105 0 12 Oct 2019
A Pre-defined Sparse Kernel Based Convolution for Deep CNNs Souvik Kundu Saurav Prakash H. Akrami Peter A. Beerel K. Chugg 41 12 0 02 Oct 2019
REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs Caiwen Ding Shuo Wang Ning Liu Kaidi Xu Yanzhi Wang Yun Liang MQ 24 89 0 29 Sep 2019
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution Taojiannan Yang Sijie Zhu Chong Chen Shen Yan Mi Zhang Andrew Willis OOD 25 74 0 27 Sep 2019
Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator Tian Zhao Yaqi Zhang K. Olukotun 33 16 0 26 Sep 2019
DASNet: Dynamic Activation Sparsity for Neural Network Efficiency Improvement Qing Yang Jiachen Mao Zuoguan Wang H. Li 21 15 0 13 Sep 2019
Cost-Driven Offloading for DNN-based Applications over Cloud, Edge and End Devices Bing Lin Yinhao Huang Jianshan Zhang Junqin Hu Xing Chen Jun Li 22 136 0 31 Jul 2019
Recurrent Neural Networks: An Embedded Computing Perspective Nesma M. Rezk M. Purnaprajna Tomas Nordstrom Z. Ul-Abdin 48 81 0 23 Jul 2019
Similarity-Preserving Knowledge Distillation Frederick Tung Greg Mori 48 961 0 23 Jul 2019
Convergence of Edge Computing and Deep Learning: A Comprehensive Survey Xiaofei Wang Yiwen Han Victor C. M. Leung Dusit Niyato Xueqiang Yan Xu Chen 24 978 0 19 Jul 2019
ViP: Virtual Pooling for Accelerating CNN-based Image Classification and Object Detection Zhuo Chen Jiyuan Zhang Ruizhou Ding Diana Marculescu 13 12 0 19 Jun 2019
Effectiveness of Distillation Attack and Countermeasure on Neural Network Watermarking Ziqi Yang Hung Dang E. Chang AAML 27 34 0 14 Jun 2019
The Architectural Implications of Facebook's DNN-based Personalized Recommendation Udit Gupta Carole-Jean Wu Xiaodong Wang Maxim Naumov Brandon Reagen ... Andrey Malevich Dheevatsa Mudigere M. Smelyanskiy Liang Xiong Xuan Zhang GNN 44 290 0 06 Jun 2019
OpenEI: An Open Framework for Edge Intelligence Xingzhou Zhang Yifan Wang Sidi Lu Liangkai Liu Lanyu Xu Weisong Shi 34 101 0 05 Jun 2019
DeepShift: Towards Multiplication-Less Neural Networks Mostafa Elhoushi Zihao Chen F. Shafiq Ye Tian Joey Yiwei Li MQ 38 97 0 30 May 2019
SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers Igor Fedorov Ryan P. Adams Matthew Mattina P. Whatmough 23 166 0 28 May 2019
Structured Compression by Weight Encryption for Unstructured Pruning and Quantization S. Kwon Dongsoo Lee Byeongwook Kim Parichay Kapoor Baeseong Park Gu-Yeon Wei MQ 35 49 0 24 May 2019
Pruning-Aware Merging for Efficient Multitask Inference Xiaoxi He Dawei Gao Zimu Zhou Yongxin Tong Lothar Thiele MoMe 37 8 0 23 May 2019
Dynamic Neural Network Channel Execution for Efficient Training Simeon E. Spasov Pietro Lio 19 4 0 15 May 2019
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning Shaohui Lin Rongrong Ji Chenqian Yan Baochang Zhang Liujuan Cao QiXiang Ye Feiyue Huang David Doermann CVBM 22 505 0 22 Mar 2019
Deep Learning on Mobile Devices - A Review Yunbin Deng 27 120 0 21 Mar 2019
Improving Device-Edge Cooperative Inference of Deep Learning via 2-Step Pruning Wenqi Shi Yunzhong Hou Sheng Zhou Z. Niu Yang Zhang Lu Geng 27 83 0 08 Mar 2019