Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown

Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference Gvsl Tej Pratap R. Kumar MQ 59 1 0 26 Dec 2020
Low-latency Perception in Off-Road Dynamical Low Visibility Environments Nelson Alves Ferreira Neto Marco Ruiz M. Reis Tiago Cajahyba David F. N. Oliveira Ana Barreto Eduardo F. Simas Filho Wagner Luiz Alves de Oliveira L. Schnitman Roberto L. S. Monteiro 29 10 0 23 Dec 2020
Adaptive Precision Training for Resource Constrained Devices Tian Huang Yaoyu Zhang Qiufeng Wang 65 5 0 23 Dec 2020
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead Maurizio Capra Beatrice Bussolino Alberto Marchisio Guido Masera Maurizio Martina Mohamed Bennai BDL 138 147 0 21 Dec 2020
Efficient CNN-LSTM based Image Captioning using Neural Network Compression Harshit Rampal Aman Mohanty VLM 46 4 0 17 Dec 2020
Revisiting Linformer with a modified self-attention with linear complexity Madhusudan Verma 51 8 0 16 Dec 2020
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis Shachar Gluska Mark Grobman MQ 54 5 0 15 Dec 2020
Scalable Verification of Quantized Neural Networks (Technical Report) T. Henzinger Mathias Lechner Dorde Zikelic MQ 64 34 0 15 Dec 2020
Demystifying Deep Neural Networks Through Interpretation: A Survey Giang Dao Minwoo Lee FaML FAtt 66 1 0 13 Dec 2020
Privacy-Preserving Spam Filtering using Functional Encryption Sicong Wang Naveen Karunanayake Tham Nguyen Suranga Seneviratne 31 2 0 08 Dec 2020
An Once-for-All Budgeted Pruning Framework for ConvNets Considering Input Resolution Wenyu Sun Jian Cao Pengtao Xu Xiangcheng Liu Pu Li 36 0 0 02 Dec 2020
Solvable Model for Inheriting the Regularization through Knowledge Distillation Luca Saglietti Lenka Zdeborová 53 20 0 01 Dec 2020
A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints P. Mohan A. Paul Abhay Chirania CVBM 64 49 0 30 Nov 2020
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization Het Shah Avishree Khare Neelay Shah Khizir Siddiqui MQ 45 6 0 30 Nov 2020
Robust Ultra-wideband Range Error Mitigation with Deep Learning at the Edge Simone Angarano Vittorio Mazzia Francesco Salvetti Giovanni Fantin Marcello Chiaberge 139 44 0 30 Nov 2020
FactorizeNet: Progressive Depth Factorization for Efficient Network Architecture Exploration Under Quantization Constraints S. Yun A. Wong MQ 29 2 0 30 Nov 2020
Where Should We Begin? A Low-Level Exploration of Weight Initialization Impact on Quantized Behaviour of Deep Neural Networks S. Yun A. Wong MQ 46 4 0 30 Nov 2020
Bringing AI To Edge: From Deep Learning's Perspective Di Liu Hao Kong Xiangzhong Luo Weichen Liu Ravi Subramaniam 116 124 0 25 Nov 2020
Auto Graph Encoder-Decoder for Neural Network Pruning Sixing Yu Arya Mazaheri Ali Jannesari GNN 81 40 0 25 Nov 2020
HAWQV3: Dyadic Neural Network Quantization Z. Yao Zhen Dong Zhangcheng Zheng A. Gholami Jiali Yu ... Leyuan Wang Qijing Huang Yida Wang Michael W. Mahoney Kurt Keutzer MQ 128 87 0 20 Nov 2020
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder Sam Davis Giuseppe Coccia Sam Gooch Julian Mack 38 0 0 20 Nov 2020
Layer-Wise Data-Free CNN Compression Maxwell Horton Yanzi Jin Ali Farhadi Mohammad Rastegari MQ 67 17 0 18 Nov 2020
Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks Jun Nishikawa Ryoji Ikegaya MQ 34 1 0 13 Nov 2020
ATCN: Resource-Efficient Processing of Time Series on Edge Mohammadreza Baharani Hamed Tabkhi AI4TS 81 1 0 10 Nov 2020
Neural Network Compression Via Sparse Optimization Tianyi Chen Bo Ji Yixin Shi Tianyu Ding Biyi Fang Sheng Yi Xiao Tu 84 16 0 10 Nov 2020
FRILL: A Non-Semantic Speech Embedding for Mobile Devices J. Peplinski Joel Shor Sachin P. Joglekar Jake Garrison Shwetak N. Patel 68 24 0 09 Nov 2020
PAMS: Quantized Super-Resolution via Parameterized Max Scale Huixia Li Chenqian Yan Shaohui Lin Xiawu Zheng Yuchao Li Baochang Zhang Fan Yang Rongrong Ji MQ 76 86 0 09 Nov 2020
ReFloat: Low-Cost Floating-Point Processing in ReRAM for Accelerating Iterative Linear Solvers Linghao Song Fan Chen Xuehai Qian Hai Li Yiran Chen 62 6 0 06 Nov 2020
Paralinguistic Privacy Protection at the Edge Ranya Aloufi Hamed Haddadi David E. Boyle 66 14 0 04 Nov 2020
Methods for Pruning Deep Neural Networks S. Vadera Salem Ameen 3DPC 76 131 0 31 Oct 2020
Visually Guided Balloon Popping with an Autonomous MAV at MBZIRC 2020 Marius Beul S. Bultmann Andre Rochow R. Rosu Daniel Schleich Malte Splietker Sven Behnke 50 8 0 28 Oct 2020
Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets Kai Han Yunhe Wang Qiulin Zhang Wei Zhang Chunjing Xu Tong Zhang 74 89 0 28 Oct 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks Jianfei Chen Yujie Gai Z. Yao Michael W. Mahoney Joseph E. Gonzalez MQ 73 59 0 27 Oct 2020
μNAS: Constrained Neural Architecture Search for Microcontrollers Edgar Liberis Łukasz Dudziak Nicholas D. Lane BDL 63 106 0 27 Oct 2020
Pre-trained Summarization Distillation Sam Shleifer Alexander M. Rush 69 103 0 24 Oct 2020
MARS: Multi-macro Architecture SRAM CIM-Based Accelerator with Co-designed Compressed Neural Networks Syuan-Hao Sie Jye-Luen Lee Yi-Ren Chen Chih-Cheng Lu C. Hsieh Meng-Fan Chang K. Tang 36 14 0 24 Oct 2020
Adaptive Pixel-wise Structured Sparse Network for Efficient CNNs Chen Tang Wenyu Sun Zhuqing Yuan Yongpan Liu 30 0 0 21 Oct 2020
Characterizing and Taming Model Instability Across Edge Devices Eyal Cidon Evgenya Pergament Zain Asgar Asaf Cidon Sachin Katti 63 7 0 18 Oct 2020
CrypTFlow2: Practical 2-Party Secure Inference Deevashwer Rathee Mayank Rathee Nishant Kumar Nishanth Chandran Divya Gupta Aseem Rastogi Rahul Sharma 139 319 0 13 Oct 2020
S3ML: A Secure Serving System for Machine Learning Inference Junming Ma Chaofan Yu Aihui Zhou Bingzhe Wu Xibin Wu Xingyu Chen Xiangqun Chen Lei Wang Donggang Cao 43 3 0 13 Oct 2020
Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification Yulin Wang Kangchen Lv Rui Huang Shiji Song Le Yang Gao Huang 3DH 57 151 0 11 Oct 2020
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer Mahdi Ghorbani Fahimeh Fooladgar S. Kasaei AAML 53 0 0 09 Oct 2020
Real-time Mask Detection on Google Edge TPU Keondo Park Won Jang Woochul Lee K. Nam Kihong Seong Kyuwook Chai Wen-Syan Li 52 13 0 09 Oct 2020
Characterising Bias in Compressed Models Sara Hooker Nyalleng Moorosi Gregory Clark Samy Bengio Emily L. Denton 79 185 0 06 Oct 2020
A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions Rahul Mishra Hari Prabhat Gupta Tanima Dutta 66 93 0 05 Oct 2020
Joint Pruning & Quantization for Extremely Sparse Neural Networks Po-Hsiang Yu Sih-Sian Wu Jan P. Klopp Liang-Gee Chen Shao-Yi Chien MQ 79 16 0 05 Oct 2020
AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers A. Wong M. Famouri M. Shafiee 66 20 0 30 Sep 2020
NITI: Training Integer Neural Networks Using Integer-only Arithmetic Maolin Wang Seyedramin Rasoulinezhad Philip H. W. Leong Hayden Kwok-Hay So MQ 49 41 0 28 Sep 2020
Towards Fully 8-bit Integer Inference for the Transformer Model Ye Lin Yanyang Li Tengbo Liu Tong Xiao Tongran Liu Jingbo Zhu MQ 78 63 0 17 Sep 2020
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation Insoo Chung Byeongwook Kim Yoonjung Choi S. Kwon Yongkweon Jeon Baeseong Park Sangha Kim Dongsoo Lee MQ 95 27 0 16 Sep 2020