Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,298 papers shown

Title
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes Sanghyun Hong Michael-Andrei Panaitescu-Liess Yigitcan Kaya Tudor Dumitras MQ 82 13 0 26 Oct 2021
Applications and Techniques for Fast Machine Learning in Science A. Deiana Nhan Tran Joshua C. Agar Michaela Blott G. D. Guglielmo ... Ashish Sharma S. Summers Pietro Vischia J. Vlimant Olivia Weng 94 72 0 25 Oct 2021
A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays Leonardo Ravaglia Manuele Rusci D. Nadalini Alessandro Capotondi Francesco Conti Luca Benini BDL 106 68 0 20 Oct 2021
EBJR: Energy-Based Joint Reasoning for Adaptive Inference Mohammad Akbari Amin Banitalebi-Dehkordi Yong Zhang BDL MQ 82 7 0 20 Oct 2021
Dynamic Slimmable Denoising Network Zutao Jiang Changlin Li Xiaojun Chang Jihua Zhu Yi Yang AI4CE 31 16 0 17 Oct 2021
Hydra: A System for Large Multi-Model Deep Learning Kabir Nagrecha Arun Kumar MoE AI4CE 73 5 0 16 Oct 2021
Differentiable Network Pruning for Microcontrollers Edgar Liberis Nicholas D. Lane 91 22 0 15 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations Xinyu Zhang Ian Colbert Ken Kreutz-Delgado Srinjoy Das MQ 100 12 0 15 Oct 2021
PTQ-SL: Exploring the Sub-layerwise Post-training Quantization Zhihang Yuan Yiqi Chen Chenhao Xue Chenguang Zhang Qiankun Wang Guangyu Sun MQ 28 3 0 15 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization Weihan Chen Peisong Wang Jian Cheng MQ 93 69 0 13 Oct 2021
Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression Zhuang Shao Xiaoliang Chen Li Du Lei Chen Yuan Du Weihao Zhuang Huadong Wei Chenjia Xie Zhongfeng Wang 40 27 0 12 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs Xiaohui Wang Yang Wei Ying Xiong Guyue Huang Xian Qian Yufei Ding Mingxuan Wang Lei Li VLM 62 33 0 12 Oct 2021
LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time Elvis Nunez Maxwell Horton Anish K. Prabhu Anurag Ranjan Ali Farhadi Mohammad Rastegari 68 4 0 08 Oct 2021
Token Pooling in Vision Transformers D. Marin Jen-Hao Rick Chang Anurag Ranjan Anish K. Prabhu Mohammad Rastegari Oncel Tuzel ViT 143 71 0 08 Oct 2021
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation Orevaoghene Ahia Julia Kreutzer Sara Hooker 188 55 0 06 Oct 2021
Shifting Capsule Networks from the Cloud to the Deep Edge Miguel Costa Diogo Costa T. Gomes Sandro Pinto 86 6 0 06 Oct 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer Sachin Mehta Mohammad Rastegari ViT 302 1,300 0 05 Oct 2021
Pre-Quantized Deep Learning Models Codified in ONNX to Enable Hardware/Software Co-Design U. Hanebutte Andrew Baldwin S. Duraković I. Filipovich Chien-Chun Chou Damian Adamowicz Derek Chickles David Hawkes MQ 32 2 0 04 Oct 2021
Progressive Transmission and Inference of Deep Learning Models Youngsoo Lee Sangdoo Yun Yeonghun Kim Sunghee Choi 44 2 0 03 Oct 2021
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference Jude Haris Perry Gibson José Cano Nicolas Bohm Agostini David Kaeli 91 19 0 01 Oct 2021
Semi-tensor Product-based Tensor Decomposition for Neural Network Compression Hengling Zhao Yipeng Liu Xiaolin Huang Ce Zhu 79 6 0 30 Sep 2021
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition Marawan Gamal Abdel Hameed Marzieh S. Tahaei A. Mosleh V. Nia 90 26 0 29 Sep 2021
Smart at what cost? Characterising Mobile Deep Neural Networks in the wild Mario Almeida Stefanos Laskaridis Abhinav Mehrotra Łukasz Dudziak Ilias Leontiadis Nicholas D. Lane HAI 161 47 0 28 Sep 2021
Consistency Training of Multi-exit Architectures for Sensor Data Aaqib Saeed 34 1 0 27 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization Yelysei Bondarenko Markus Nagel Tijmen Blankevoort MQ 83 146 0 27 Sep 2021
Deep Structured Instance Graph for Distilling Object Detectors Yixin Chen Pengguang Chen Shu Liu Liwei Wang Jiaya Jia ObjD ISeg 61 12 0 27 Sep 2021
Chess AI: Competing Paradigms for Machine Intelligence Shivanand Maharaj Nicholas G. Polson Alex Turk ELM 88 26 0 23 Sep 2021
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers Changlin Li Guangrun Wang Bing Wang Xiaodan Liang Zhihui Li Xiaojun Chang 96 9 0 21 Sep 2021
Robustness Analysis of Deep Learning Frameworks on Mobile Platforms Amin Eslami Abyane Hadi Hemmati AAML 77 3 0 20 Sep 2021
Towards Energy-Efficient and Secure Edge AI: A Cross-Layer Framework Muhammad Shafique Alberto Marchisio Rachmad Vidya Wicaksana Putra Muhammad Abdullah Hanif 98 34 0 20 Sep 2021
iRNN: Integer-only Recurrent Neural Network Eyyub Sari Vanessa Courville V. Nia MQ 85 4 0 20 Sep 2021
HPTQ: Hardware-Friendly Post Training Quantization H. Habi Reuven Peretz Elad Cohen Lior Dikstein Oranit Dror I. Diamant Roy H. Jennings Arnon Netzer MQ 83 9 0 19 Sep 2021
Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching Tahseen Rabbani Brandon Yushan Feng Marco Bornstein Kyle Rui Sang Yifan Yang Arjun Rajkumar A. Varshney Furong Huang FedML 119 2 0 17 Sep 2021
Phrase Retrieval Learns Passage Retrieval, Too Jinhyuk Lee Alexander Wettig Danqi Chen RALM DML 82 48 0 16 Sep 2021
Complexity-aware Adaptive Training and Inference for Edge-Cloud Distributed AI Systems Yinghan Long I. Chakraborty G. Srinivasan Kaushik Roy 61 15 0 14 Sep 2021
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency Yonggan Fu Yang Zhao Qixuan Yu Chaojian Li Yingyan Lin AAML 170 14 0 11 Sep 2021
Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning Prasetya Ajie Utama N. Moosavi Victor Sanh Iryna Gurevych AAML 128 36 0 09 Sep 2021
Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks Cheng Gong Ye Lu Kunpeng Xie Zongming Jin Tao Li Yanzhi Wang MQ 64 7 0 08 Sep 2021
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables B. Prabakaran Asima Akhtar Semeen Rehman Osman Hasan Muhammad Shafique 26 10 0 07 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization Yi Guo Huan Yuan Jianchao Tan Zhangyang Wang Sen Yang Ji Liu 92 46 0 06 Sep 2021
Spiking Neural Networks with Improved Inherent Recurrence Dynamics for Sequential Learning Wachirawit Ponghiran Kaushik Roy 115 49 0 04 Sep 2021
On the Accuracy of Analog Neural Network Inference Accelerators T. Xiao Ben Feinberg C. Bennett V. Prabhakar Prashant Saxena V. Agrawal S. Agarwal M. Marinella 54 41 0 03 Sep 2021
Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization Haotong Qin Yifu Ding Xiangguo Zhang Jiakai Wang Xianglong Liu Jiwen Lu DiffM MQ 67 57 0 01 Sep 2021
Architecture Aware Latency Constrained Sparse Neural Networks Tianli Zhao Qinghao Hu Xiangyu He Weixiang Xu Jiaxing Wang Cong Leng Jian Cheng 67 0 0 01 Sep 2021
Pruning with Compensation: Efficient Channel Pruning for Deep Convolutional Neural Networks Zhouyang Xie Yan Fu Sheng-Zhao Tian Junlin Zhou Duanbing Chen 3DV 48 0 0 31 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI Amin Banitalebi-Dehkordi Naveen Vedula J. Pei Fei Xia Lanjun Wang Yong Zhang 79 92 0 30 Aug 2021
Compact representations of convolutional neural networks via weight pruning and quantization Giosuè Cataldo Marinò A. Petrini D. Malchiodi Marco Frasca MQ 23 4 0 28 Aug 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation Samuel Cahyawijaya 103 12 0 24 Aug 2021
On the Acceleration of Deep Neural Network Inference using Quantized Compressed Sensing Meshia Cédric Oveneke MQ 49 0 0 23 Aug 2021
Supervised Compression for Resource-Constrained Edge Computing Systems Yoshitomo Matsubara Ruihan Yang Marco Levorato Stephan Mandt 118 58 0 21 Aug 2021