Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown

Neural Machine Translation with 4-Bit Precision and Beyond Alham Fikri Aji Kenneth Heafield MQ 22 7 0 13 Sep 2019
Differentiable Mask for Pruning Convolutional and Recurrent Networks R. Ramakrishnan Eyyub Sari V. Nia VLM 82 15 0 10 Sep 2019
PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors Angelo Garofalo Manuele Rusci Francesco Conti D. Rossi Luca Benini MQ 66 137 0 29 Aug 2019
Real-time Person Re-identification at the Edge: A Mixed Precision Approach Mohammadreza Baharani Shrey Mohan Hamed Tabkhi 54 10 0 19 Aug 2019
Adaptative Inference Cost With Convolutional Neural Mixture Models Adria Ruiz Jakob Verbeek VLM 65 22 0 19 Aug 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks Ruihao Gong Xianglong Liu Shenghu Jiang Tian-Hao Li Peng Hu Jiazhen Lin F. Yu Junjie Yan MQ 102 460 0 14 Aug 2019
Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations Bohan Zhuang Jing Liu Mingkui Tan Lingqiao Liu Ian Reid Chunhua Shen MQ 99 46 0 10 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge H. F. Langroudi Zachariah Carmichael David Pastuch Dhireesha Kudithipudi 54 24 0 06 Aug 2019
Deep Learning Training on the Edge with Low-Precision Posits H. F. Langroudi Zachariah Carmichael Dhireesha Kudithipudi MQ 65 14 0 30 Jul 2019
Similarity-Preserving Knowledge Distillation Frederick Tung Greg Mori 134 985 0 23 Jul 2019
Batch-Shaping for Learning Conditional Channel Gated Networks B. Bejnordi Tijmen Blankevoort Max Welling AI4CE 92 77 0 15 Jul 2019
Neural Epitome Search for Architecture-Agnostic Network Compression Daquan Zhou Xiaojie Jin Qibin Hou Kaixin Wang Jianchao Yang Jiashi Feng 91 13 0 12 Jul 2019
Template-Based Posit Multiplication for Training and Inferring in Neural Networks Raul Murillo Alberto A. Del Barrio Guillermo Botella Juan 36 16 0 09 Jul 2019
Data-Independent Neural Pruning via Coresets Ben Mussay Margarita Osadchy Vladimir Braverman Samson Zhou Dan Feldman 104 60 0 09 Jul 2019
QUOTIENT: Two-Party Secure Neural Network Training and Prediction Nitin Agrawal Ali Shahin Shamsabadi Matt J. Kusner Adria Gascon 102 216 0 08 Jul 2019
Weight Normalization based Quantization for Deep Neural Network Compression Wenhong Cai Wu-Jun Li 48 14 0 01 Jul 2019
GAN-Knowledge Distillation for one-stage Object Detection Wanwei Wang Jin ke Yu Fan Zong ObjD 52 29 0 20 Jun 2019
A One-step Pruning-recovery Framework for Acceleration of Convolutional Neural Networks Dong Wang Lei Zhou Xiao Bai Jun Zhou 28 2 0 18 Jun 2019
Visual Wake Words Dataset Aakanksha Chowdhery Pete Warden Jonathon Shlens Andrew G. Howard Rocky Rhodes VLM 88 102 0 12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference Michele Covell David Marwood S. Baluja Nick Johnston MQ 41 7 0 11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction Markus Nagel M. V. Baalen Tijmen Blankevoort Max Welling MQ 101 515 0 11 Jun 2019
DiCENet: Dimension-wise Convolutions for Efficient Networks Sachin Mehta Hannaneh Hajishirzi Mohammad Rastegari 101 43 0 08 Jun 2019
Fighting Quantization Bias With Bias Alexander Finkelstein Uri Almog Mark Grobman MQ 82 57 0 07 Jun 2019
Addressing Limited Weight Resolution in a Fully Optical Neuromorphic Reservoir Computing Readout Chonghuai Ma Floris Laporte J. Dambre P. Bienstman 26 10 0 06 Jun 2019
DeepShift: Towards Multiplication-Less Neural Networks Mostafa Elhoushi Zihao Chen F. Shafiq Ye Tian Joey Yiwei Li MQ 131 102 0 30 May 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers Manuele Rusci Alessandro Capotondi Luca Benini MQ 97 75 0 30 May 2019
RecNets: Channel-wise Recurrent Convolutional Neural Networks George Retsinas Athena Elafrou G. Goumas Petros Maragos 29 2 0 28 May 2019
CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks Weicheng Li Rui Wang Zhongzhi Luan Di Huang Zidong Du Yunji Chen D. Qian 27 1 0 28 May 2019
OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks Jiashi Li Q. Qi Jingyu Wang Ce Ge Yujian Betterest Li Zhangzhang Yue Haifeng Sun BDL CML 101 53 0 28 May 2019
Seeing Convolution Through the Eyes of Finite Transformation Semigroup Theory: An Abstract Algebraic Interpretation of Convolutional Neural Networks Andrew Hryniowski A. Wong 48 0 0 26 May 2019
Feature Map Transform Coding for Energy-Efficient CNN Inference Brian Chmiel Chaim Baskin Ron Banner Evgenii Zheltonozhskii Yevgeny Yermolin Alex Karbachevsky A. Bronstein A. Mendelson 89 26 0 26 May 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis Chaoqi Wang Roger C. Grosse Sanja Fidler Guodong Zhang 80 124 0 15 May 2019
EdgeSegNet: A Compact Network for Semantic Segmentation Z. Q. Lin Brendan Chwyl A. Wong SSeg 49 9 0 10 May 2019
Seesaw-Net: Convolution Neural Network With Uneven Group Convolution Jintao Zhang BDL 65 7 0 09 May 2019
Searching for MobileNetV3 Andrew G. Howard Mark Sandler Grace Chu Liang-Chieh Chen Bo Chen ... Yukun Zhu Ruoming Pang Vijay Vasudevan Quoc V. Le Hartwig Adam 455 6,872 0 06 May 2019
Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices Yiwu Yao Weiqiang Yang Haoqi Zhu 37 0 0 06 May 2019
Parity Models: A General Framework for Coding-Based Resilience in ML Inference J. Kosaian K. V. Rashmi Shivaram Venkataraman 109 14 0 02 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation Bradley McDanel Shanghang Zhang H. T. Kung Xin Dong MQ 29 2 0 01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision Zhen Dong Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 97 530 0 29 Apr 2019
Towards Efficient Model Compression via Learned Global Ranking Ting-Wu Chin Ruizhou Ding Cha Zhang Diana Marculescu 83 172 0 28 Apr 2019
Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks Y. Zur Chaim Baskin Evgenii Zheltonozhskii Brian Chmiel Itay Evron A. Bronstein A. Mendelson MQ 85 7 0 22 Apr 2019
Defensive Quantization: When Efficiency Meets Robustness Ji Lin Chuang Gan Song Han MQ 118 204 0 17 Apr 2019
Towards Real-Time Automatic Portrait Matting on Mobile Devices Seokjun Seo Seungwoo Choi Martin Kersner Beomjun Shin Hyungsuk Yoon Hyeongmin Byun S. Ha 3DH 21 3 0 08 Apr 2019
Progressive Stochastic Binarization of Deep Networks David Hartmann Michael Wand MQ 99 1 0 03 Apr 2019
Patchwork: A Patch-wise Attention Network for Efficient Object Detection and Segmentation in Video Streams Yuning Chai VOS 78 30 0 03 Apr 2019
Training Quantized Neural Networks with a Full-precision Auxiliary Module Bohan Zhuang Lingqiao Liu Mingkui Tan Chunhua Shen Ian Reid MQ 102 62 0 27 Mar 2019
Looking Fast and Slow: Memory-Guided Mobile Video Object Detection Mason Liu Menglong Zhu Marie White Yinxiao Li Dmitry Kalenichenko 84 83 0 25 Mar 2019
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning Shaohui Lin Rongrong Ji Chenqian Yan Baochang Zhang Liujuan Cao QiXiang Ye Feiyue Huang David Doermann CVBM 58 510 0 22 Mar 2019
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks Sambhav R. Jain Albert Gural Michael Wu Chris Dick MQ 112 152 0 19 Mar 2019
AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design A. Wong Z. Q. Lin Brendan Chwyl HAI 57 14 0 18 Mar 2019