EIE: Efficient Inference Engine on Compressed Deep Neural Network

4 February 2016

Song Han

Papers citing "EIE: Efficient Inference Engine on Compressed Deep Neural Network"

Showing 50 of 325 citing papers

Improving the Efficiency of Transformers for Resource-Constrained Devices Hamid Tabani Ajay Balasubramaniam Shabbir Marzban Elahe Arani Bahram Zonooz 46 20 0 30 Jun 2021
Layer Folding: Neural Network Depth Reduction using Activation Linearization Amir Ben Dror Niv Zehngut Avraham Raviv E. Artyomov Ran Vitek R. Jevnisek 34 20 0 17 Jun 2021
ATRIA: A Bit-Parallel Stochastic Arithmetic Based Accelerator for In-DRAM CNN Processing Supreeth Mysore Shivanandamurthy Ishan G. Thakkar S. A. Salehi 20 5 0 26 May 2021
Dual-side Sparse Tensor Core Yang-Feng Wang Chen Zhang Zhiqiang Xie Cong Guo Yunxin Liu Jingwen Leng 30 75 0 20 May 2021
VersaGNN: a Versatile accelerator for Graph neural networks Feng Shi Yiqiao Jin Song-Chun Zhu GNN 58 17 0 04 May 2021
Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation with Secret Key Maungmaung Aprilpyone Hitoshi Kiya 27 17 0 09 Apr 2021
SETGAN: Scale and Energy Trade-off GANs for Image Applications on Mobile Platforms Nitthilan Kanappan Jayakodi J. Doppa P. Pande GAN 36 4 0 23 Mar 2021
Extending Sparse Tensor Accelerators to Support Multiple Compression Formats Eric Qin Geonhwa Jeong William Won Sheng-Chun Kao Hyoukjun Kwon Sudarshan Srinivasan Dipankar Das G. Moon S. Rajamanickam T. Krishna 35 18 0 18 Mar 2021
unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation Stylianos I. Venieris Javier Fernandez-Marques Nicholas D. Lane 29 11 0 09 Mar 2021
Knowledge Evolution in Neural Networks Ahmed Taha Abhinav Shrivastava L. Davis 51 21 0 09 Mar 2021
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search Kartik Hegde Po-An Tsai Sitao Huang Vikas Chandra A. Parashar Christopher W. Fletcher 26 93 0 02 Mar 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey Tailin Liang C. Glossner Lei Wang Shaobo Shi Xiaotong Zhang MQ 150 678 0 24 Jan 2021
Direct Spatial Implementation of Sparse Matrix Multipliers for Reservoir Computing Matthew Denton H. Schmit 16 2 0 21 Jan 2021
BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification Seyed Abolfazl Ghasemzadeh E. Tavakoli M. Kamal A. Afzali-Kusha Massoud Pedram 24 13 0 07 Jan 2021
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead Maurizio Capra Beatrice Bussolino Alberto Marchisio Guido Masera Maurizio Martina Mohamed Bennai BDL 64 140 0 21 Dec 2020
FantastIC4: A Hardware-Software Co-Design Approach for Efficiently Running 4bit-Compact Multilayer Perceptrons Simon Wiedemann Suhas Shivapakash P. Wiedemann Daniel Becking Wojciech Samek F. Gerfers Thomas Wiegand MQ 28 7 0 17 Dec 2020
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning Hanrui Wang Zhekai Zhang Song Han 48 380 0 17 Dec 2020
Robustness and Transferability of Universal Attacks on Compressed Models Alberto G. Matachana Kenneth T. Co Luis Muñoz-González David Martínez Emil C. Lupu AAML 29 10 0 10 Dec 2020
The Why, What and How of Artificial General Intelligence Chip Development Alex P. James 27 20 0 08 Dec 2020
Bringing AI To Edge: From Deep Learning's Perspective Di Liu Hao Kong Xiangzhong Luo Weichen Liu Ravi Subramaniam 57 117 0 25 Nov 2020
In-Memory Nearest Neighbor Search with FeFET Multi-Bit Content-Addressable Memories Arman Kazemi M. Sharifi Ann Franchesca Laguna F. Müller R. Rajaei R. Olivo T. Kämpfe Michael Niemier X. S. Hu MQ 13 37 0 13 Nov 2020
LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference Yujeong Choi Yunseong Kim Minsoo Rhu 24 66 0 25 Oct 2020
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training Youngeun Kwon Yunjae Lee Minsoo Rhu 27 40 0 25 Oct 2020
FPRaker: A Processing Element For Accelerating Neural Network Training Omar Mohamed Awad Mostafa Mahmoud Isak Edo Vivancos Ali Hadi Zadeh Ciaran Bannon Anand Jayarajan Gennady Pekhimenko Andreas Moshovos 28 15 0 15 Oct 2020
Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win Utku Evci Yani Andrew Ioannou Cem Keskin Yann N. Dauphin 42 87 0 07 Oct 2020
GECKO: Reconciling Privacy, Accuracy and Efficiency in Embedded Deep Learning Vasisht Duddu A. Boutet Virat Shejwalkar GNN 24 4 0 02 Oct 2020
Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training Dingqing Yang Amin Ghasemazar X. Ren Maximilian Golub G. Lemieux Mieszko Lis 22 48 0 23 Sep 2020
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning Bingbing Li Zhenglun Kong Tianyun Zhang Ji Li Zechao Li Hang Liu Caiwen Ding VLM 32 64 0 17 Sep 2020
MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework Sung-En Chang Yanyu Li Mengshu Sun Weiwen Jiang Runbin Shi Xue Lin Yanzhi Wang MQ 27 7 0 16 Sep 2020
OrthoReg: Robust Network Pruning Using Orthonormality Regularization Ekdeep Singh Lubana Puja Trivedi C. Hougen Robert P. Dick Alfred Hero 37 1 0 10 Sep 2020
Layer-specific Optimization for Mixed Data Flow with Mixed Precision in FPGA Design for CNN-based Object Detectors Duy-Thanh Nguyen Hyun Kim Hyuk-Jae Lee MQ 25 60 0 03 Sep 2020
Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity Cong Guo B. Hsueh Jingwen Leng Yuxian Qiu Yue Guan Zehuan Wang Xiaoying Jia Xipeng Li Minyi Guo Yuhao Zhu 35 83 0 29 Aug 2020
CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices Parth Mannan A. Samajdar T. Krishna 31 2 0 27 Aug 2020
SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference Ziheng Wang 40 67 0 26 Aug 2020
Training Sparse Neural Networks using Compressed Sensing Jonathan W. Siegel Jianhong Chen Pengchuan Zhang Jinchao Xu 28 5 0 21 Aug 2020
Artificial Neural Networks and Fault Injection Attacks Shahin Tajik F. Ganji SILM 13 10 0 17 Aug 2020
Compression of Deep Learning Models for Text: A Survey Manish Gupta Puneet Agrawal VLM MedIm AI4CE 22 115 0 12 Aug 2020
Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node Alfio Di Mauro Francesco Conti Pasquale Davide Schiavone D. Rossi Luca Benini 26 9 0 17 Jul 2020
AQD: Towards Accurate Fully-Quantized Object Detection Peng Chen Jing Liu Bohan Zhuang Mingkui Tan Chunhua Shen MQ 34 10 0 14 Jul 2020
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights Shail Dave Riyadh Baghdadi Tony Nowatzki Sasikanth Avancha Aviral Shrivastava Baoxin Li 64 82 0 02 Jul 2020
Extension of Direct Feedback Alignment to Convolutional and Recurrent Neural Network for Bio-plausible Deep Learning Donghyeon Han Gwangtae Park Junha Ryu H. Yoo 3DV 20 5 0 23 Jun 2020
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles Sicong Liu Junzhao Du Kaiming Nan Zimu Zhou Zhangyang Wang Yingyan Lin 32 30 0 08 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks Ilia Shumailov Yiren Zhao Daniel Bates Nicolas Papernot Robert D. Mullins Ross J. Anderson SILM 19 128 0 05 Jun 2020
TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids Igor Fedorov Marko Stamenovic Carl R. Jensen Li-Chia Yang Ari Mandell Yiming Gan Matthew Mattina P. Whatmough 16 96 0 20 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning Victor Sanh Thomas Wolf Alexander M. Rush 32 472 0 15 May 2020
GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference Ali Hadi Zadeh Isak Edo Omar Mohamed Awad Andreas Moshovos MQ 32 185 0 08 May 2020
SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation Yang Zhao Xiaohan Chen Yue Wang Chaojian Li Haoran You Y. Fu Yuan Xie Zhangyang Wang Yingyan Lin MQ 40 43 0 07 May 2020
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture Christopher Brix Parnia Bahar Hermann Ney 16 38 0 04 May 2020
TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain Weitao Li Pengfei Xu Yang Zhao Haitong Li Yuan Xie Yingyan Lin 17 69 0 03 May 2020
Lupulus: A Flexible Hardware Accelerator for Neural Networks Andreas Toftegaard Kristensen R. Giterman Alexios Balatsoukas-Stimming A. Burg 36 0 0 03 May 2020