Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown

Lightweight Residual Densely Connected Convolutional Neural Network Fahimeh Fooladgar S. Kasaei 57 13 0 02 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework Yaohui Cai Z. Yao Zhen Dong A. Gholami Michael W. Mahoney Kurt Keutzer MQ 127 400 0 01 Jan 2020
Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection Tianshu Chu Qin Luo Jie Yang Xiaolin Huang MQ 40 6 0 29 Dec 2019
Towards Unified INT8 Training for Convolutional Neural Network Feng Zhu Ruihao Gong F. Yu Xianglong Liu Yanfei Wang Zhelong Li Xiuqi Yang Junjie Yan MQ 97 152 0 29 Dec 2019
Towards Efficient Training for Neural Network Quantization Qing Jin Linjie Yang Zhenyu A. Liao MQ 110 42 0 21 Dec 2019
AdaBits: Neural Network Quantization with Adaptive Bit-Widths Qing Jin Linjie Yang Zhenyu A. Liao MQ 93 124 0 20 Dec 2019
Predicting detection filters for small footprint open-vocabulary keyword spotting Théodore Bluche Thibault Gisselbrecht ObjD 131 19 0 16 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression Matan Haroush Itay Hubara Elad Hoffer Daniel Soudry 82 110 0 03 Dec 2019
ReD-CaNe: A Systematic Methodology for Resilience Analysis and Design of Capsule Networks under Approximations Alberto Marchisio Vojtěch Mrázek Muhammad Abdullah Hanif Mohamed Bennai AAML 56 15 0 02 Dec 2019
Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization J. H. Lee Jihun Yun Sung Ju Hwang Eunho Yang MQ 24 0 0 29 Nov 2019
GhostNet: More Features from Cheap Operations Kai Han Yunhe Wang Qi Tian Jianyuan Guo Chunjing Xu Chang Xu 114 2,717 0 27 Nov 2019
Structured Multi-Hashing for Model Compression Elad Eban Yair Movshovitz-Attias Hao Wu Mark Sandler Andrew Poon Yerlan Idelbayev M. A. Carreira-Perpiñán 78 18 0 25 Nov 2019
Quantization Networks Jiwei Yang Xu Shen Jun Xing Xinmei Tian Houqiang Li Bing Deng Jianqiang Huang Xiansheng Hua MQ 88 351 0 21 Nov 2019
REVAMP$^2$T: Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking Christopher Neff Matías Mendieta Shrey Mohan Mohammadreza Baharani Samuel Rogers Hamed Tabkhi 49 56 0 20 Nov 2019
CUP: Cluster Pruning for Compressing Deep Neural Networks Rahul Duggal Cao Xiao R. Vuduc Jimeng Sun 3DPC VLM 48 23 0 19 Nov 2019
Distributed Low Precision Training Without Mixed Precision Zehua Cheng Weiyan Wang Yan Pan Thomas Lukasiewicz MQ 52 5 0 18 Nov 2019
Selective sampling for accelerating training of deep neural networks Berry Weinstein Shai Fine Y. Hel-Or 23 3 0 16 Nov 2019
DupNet: Towards Very Tiny Quantized CNN with Improved Accuracy for Face Detection Hongxing Gao Wei Tao Dongchao Wen Junjie Liu Tse-Wei Chen Kinya Osa Masami Kato CVBM 36 5 0 13 Nov 2019
Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation Junjie Liu Dongchao Wen Hongxing Gao Wei Tao Tse-Wei Chen Kinya Osa Masami Kato 81 21 0 13 Nov 2019
What Do Compressed Deep Neural Networks Forget? Sara Hooker Aaron Courville Gregory Clark Yann N. Dauphin Andrea Frome 118 185 0 13 Nov 2019
Scientific Image Restoration Anywhere V. Abeykoon Zhengchun Liu R. Kettimuthu Geoffrey C. Fox Ian Foster 67 19 0 12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks Zhen Dong Z. Yao Yaohui Cai Daiyaan Arfeen A. Gholami Michael W. Mahoney Kurt Keutzer MQ 97 284 0 10 Nov 2019
Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection Vicent Sanz Marco Ben Taylor Ziyi Wang Y. Elkhatib 70 61 0 09 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition Alex Bie Bharat Venkitesh João Monteiro Md. Akmal Haidar Mehdi Rezagholizadeh MQ 139 27 0 09 Nov 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective Sauptik Dhar Junyao Guo Jiayi Liu S. Tripathi Unmesh Kurup Mohak Shah 131 144 0 02 Nov 2019
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers Xishan Zhang Shaoli Liu Rui Zhang Chang-Shu Liu Di Huang ... Jiaming Guo Yu Kang Qi Guo Zidong Du Yunji Chen MQ 56 7 0 01 Nov 2019
On Distributed Quantization for Classification Osama A. Hanna Yahya H. Ezzeldin Tara Sadjadpour Christina Fragouli Suhas Diggavi MQ 81 14 0 01 Nov 2019
In-Place Zero-Space Memory Protection for CNN Hui Guan Lin Ning Zhen Lin Xipeng Shen Huiyang Zhou Seung-Hwan Lim 60 28 0 31 Oct 2019
Secure Evaluation of Quantized Neural Networks Anders Dalskov Daniel E. Escudero Marcel Keller 109 143 0 28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research Neta Zmora Guy Jacob Lev Zlotnik Bar Elharar Gal Novik 58 74 0 27 Oct 2019
Reversible designs for extreme memory cost reduction of CNN training T. Hascoet Q. Febvre Y. Ariki T. Takiguchi 3DV 15 2 0 24 Oct 2019
Fully Quantized Transformer for Machine Translation Gabriele Prato Ella Charlaix Mehdi Rezagholizadeh MQ 77 70 0 17 Oct 2019
Neural Network Design for Energy-Autonomous AI Applications using Temporal Encoding S. Mileiko Thanasin Bunnam F. Xia Rishad Shafik Alex Yakovlev Shidhartha Das 25 0 0 15 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019 Andrey D. Ignatov Radu Timofte Andrei Kulik Seungsoo Yang Ke Wang Felix Baum Max Wu Lirong Xu Luc Van Gool ELM 53 222 0 15 Oct 2019
Q8BERT: Quantized 8Bit BERT Ofir Zafrir Guy Boudoukh Peter Izsak Moshe Wasserblat MQ 112 507 0 14 Oct 2019
Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-based Approach Haichuan Yang Shupeng Gui Yuhao Zhu Ji Liu MQ 71 5 0 14 Oct 2019
OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators Ritchie Zhao Jordan Dotzel Zhanqiu Hu Preslav Ivanov Christopher De Sa Zhiru Zhang MQ 36 1 0 13 Oct 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM Skanda Koppula Lois Orosa A. G. Yaglikçi Roknoddin Azizi Taha Shahroodi Konstantinos Kanellopoulos O. Mutlu 80 108 0 12 Oct 2019
QPyTorch: A Low-Precision Arithmetic Simulation Framework Tianyi Zhang Zhiqiu Lin Guandao Yang Christopher De Sa MQ 69 66 0 09 Oct 2019
Bit Efficient Quantization for Deep Neural Networks Prateeth Nayak David C. Zhang S. Chai MQ 75 44 0 07 Oct 2019
Neural networks on microcontrollers: saving memory at inference via operator reordering Edgar Liberis Nicholas D. Lane 66 46 0 02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques Yiyuan Ma Li-Wen Chang Yang Chen Kefeng Deng Amit Agarwal Emad Barsoum Abe Taha MQ 37 7 0 01 Oct 2019
Automated design of error-resilient and hardware-efficient deep neural networks Christoph Schorn T. Elsken Sebastian Vogel Armin Runge A. Guntoro G. Ascheid AAML 47 32 0 30 Sep 2019
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference Thierry Tambe En-Yu Yang Zishen Wan Yuntian Deng Vijay Janapa Reddi Alexander M. Rush David Brooks Gu-Yeon Wei MQ 58 21 0 29 Sep 2019
Training convolutional neural networks with cheap convolutions and online distillation Jiao Xie Shaohui Lin Yichen Zhang Linkai Luo 63 12 0 28 Sep 2019
Optimizing Speech Recognition For The Edge Yuan Shangguan Jian Li Qiao Liang R. Álvarez Ian McGraw 87 64 0 26 Sep 2019
Balanced Binary Neural Networks with Gated Residual Mingzhu Shen Xianglong Liu Ruihao Gong Kai Han MQ 72 36 0 26 Sep 2019
Structured Binary Neural Networks for Image Recognition Bohan Zhuang Chunhua Shen Mingkui Tan Peng Chen Lingqiao Liu Ian Reid MQ 135 19 0 22 Sep 2019
Density Encoding Enables Resource-Efficient Randomly Connected Neural Networks Denis Kleyko Mansour Kheffache E. P. Frady U. Wiklund Evgeny Osipov 63 46 0 19 Sep 2019
CrypTFlow: Secure TensorFlow Inference Nishant Kumar Mayank Rathee Nishanth Chandran Divya Gupta Aseem Rastogi Rahul Sharma 156 244 0 16 Sep 2019