Fixed Point Quantization of Deep Convolutional Networks

19 November 2015

Papers citing "Fixed Point Quantization of Deep Convolutional Networks"

50 / 124 papers shown

Distributed Quantum Neural Networks on Distributed Photonic Quantum Computing Kuan-Cheng Chen Chen-Yu Liu Yu Shang Felix Burt Kin K Leung 44 0 0 13 May 2025
Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks Erin Carson Xinye Chen 54 0 0 10 Apr 2025
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning Sanghwan Bae Jiwoo Hong Min Young Lee Hanbyul Kim Jeongyeon Nam Donghyun Kwak OffRL LRM 53 0 0 04 Apr 2025
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance Jaskirat Singh Bram Adams Ahmed E. Hassan VLM 45 0 0 01 Nov 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity Kanghyun Choi Hyeyoon Lee Dain Kwon Sunjong Park Kyuyeun Kim Noseong Park Jinho Lee MQ 50 1 0 29 Jul 2024
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks Jianing He Qi Zhang Weiping Ding Duoqian Miao Jun Zhao Liang Hu LongBing Cao 38 3 0 03 Feb 2024
In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms Philipp Schilk Niccolò Polvani Andrea Ronco Milos Cernak Michele Magno 37 12 0 05 Sep 2023
Learning Discrete Weights and Activations Using the Local Reparameterization Trick G. Berger Aviv Navon Ethan Fetaya MQ 22 0 0 04 Jul 2023
Task-Oriented Communication Design at Scale Arsham Mostaani T. Vu Hamed Habibi Symeon Chatzinotas Björn E. Ottersten 29 3 0 15 May 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer Junrui Xiao Zhikai Li Lianwei Yang Qingyi Gu MQ 32 12 0 11 May 2023
Are Visual Recognition Models Robust to Image Compression? Joao Maria Janeiro Stanislav Frolov Alaaeldin El-Nouby Jakob Verbeek VLM 28 4 0 10 Apr 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks Cheng Gong Ye Lu Surong Dai Deng Qian Chenkun Du Tao Li MQ 29 0 0 07 Apr 2023
A Heterogeneous Parallel Non-von Neumann Architecture System for Accurate and Efficient Machine Learning Molecular Dynamics Zhuoying Zhao Ziling Tan Pinghui Mo Xiaonan Wang Dan Zhao Xin Zhang Ming Tao Jie Liu 24 1 0 26 Mar 2023
Fixed-point quantization aware training for on-device keyword-spotting Sashank Macha Om Oza Alex Escott Francesco Calivá Robert M. Armitano S. Cheekatmalla S. Parthasarathi Yuzong Liu MQ 21 4 0 04 Mar 2023
Rethinking Data-Free Quantization as a Zero-Sum Game Biao Qian Yang Wang Richang Hong Meng Wang MQ 18 17 0 19 Feb 2023
Towards Implementing Energy-aware Data-driven Intelligence for Smart Health Applications on Mobile Platforms G. D. Samaraweera Hung Nguyen Hadi Zanddizari Behnam Zeinali Jerome Chang 30 0 0 01 Feb 2023
The Hidden Power of Pure 16-bit Floating-Point Neural Networks Juyoung Yun Byungkon Kang Zhoulai Fu MQ 26 1 0 30 Jan 2023
QEBVerif: Quantization Error Bound Verification of Neural Networks Yedi Zhang Fu Song Jun Sun MQ 26 11 0 06 Dec 2022
OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks Benoit Steiner Mostafa Elhoushi Jacob Kahn James Hegarty 31 8 0 24 Oct 2022
MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos Mathias Parger Chengcheng Tang Thomas Neff Christopher D. Twigg Cem Keskin Robert Y. Wang M. Steinberger 27 6 0 18 Oct 2022
Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction T. M. Khan Syed S. Naqvi A. Robles-Kelly Erik H. W. Meijering 44 7 0 14 Oct 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey Dalin Zhang Kaixuan Chen Yan Zhao B. Yang Li-Ping Yao Christian S. Jensen 48 3 0 22 Aug 2022
FP8 Quantization: The Power of the Exponent Andrey Kuzmin M. V. Baalen Yuwei Ren Markus Nagel Jorn W. T. Peters Tijmen Blankevoort MQ 25 80 0 19 Aug 2022
Mixed-Precision Neural Networks: A Survey M. Rakka M. Fouda Pramod P. Khargonekar Fadi J. Kurdahi MQ 25 11 0 11 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA Cecilia Latotzke Tim Ciesielski T. Gemmeke MQ 13 7 0 09 Aug 2022
Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers Ji Xin Raphael Tang Zhiying Jiang Yaoliang Yu Jimmy J. Lin 20 1 0 31 Jul 2022
CoNLoCNN: Exploiting Correlation and Non-Uniform Quantization for Energy-Efficient Low-precision Deep Convolutional Neural Networks Muhammad Abdullah Hanif G. M. Sarda Alberto Marchisio Guido Masera Maurizio Martina Muhammad Shafique MQ 22 4 0 31 Jul 2022
DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware H. Hashemi Yongqin Wang M. Annavaram FedML 26 58 0 30 Jun 2022
Impala: Low-Latency, Communication-Efficient Private Deep Learning Inference Woojin Choi Brandon Reagen Gu-Yeon Wei David Brooks FedML 53 7 0 13 May 2022
FlexBlock: A Flexible DNN Training Accelerator with Multi-Mode Block Floating Point Support Seock-Hwan Noh Jahyun Koo Seunghyun Lee Jongse Park Jaeha Kung AI4CE 32 17 0 13 Mar 2022
Engineering the Neural Automatic Passenger Counter Nico Jahn Michael Siebert 13 2 0 02 Mar 2022
Multi-task Learning Approach for Modulation and Wireless Signal Classification for 5G and Beyond: Edge Deployment via Model Compression Anu Jagannath Jithin Jagannath 28 26 0 26 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment Jemin Lee Misun Yu Yongin Kwon Teaho Kim MQ 25 17 0 10 Feb 2022
Accelerating DNN Training with Structured Data Gradient Pruning Bradley McDanel Helia Dinh J. Magallanes 17 7 0 01 Feb 2022
Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks G. Cerutti Lukas Cavigelli Renzo Andri Michele Magno Elisabetta Farella Luca Benini 26 14 0 10 Jan 2022
Speedup deep learning models on GPU by taking advantage of efficient unstructured pruning and bit-width reduction Marcin Pietroń Dominik Zurek 30 13 0 28 Dec 2021
Neural Network Quantization for Efficient Inference: A Survey Olivia Weng MQ 28 23 0 08 Dec 2021
Phantom: A High-Performance Computational Core for Sparse Convolutional Neural Networks Mahmood Azhar Qureshi Arslan Munir 30 0 0 09 Nov 2021
BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge Abdelrahman I. Hosny Marina Neseem Sherief Reda MQ 35 4 0 29 Oct 2021
Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme Shaojie Li Jie Wu Xuefeng Xiao Xudong Mao Rongrong Ji 23 35 0 27 Oct 2021
Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks Yikai Wang Yi Yang Gang Hua Anbang Yao MQ 29 15 0 18 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization Yelysei Bondarenko Markus Nagel Tijmen Blankevoort MQ 25 133 0 27 Sep 2021
MAFAT: Memory-Aware Fusing and Tiling of Neural Networks for Accelerated Edge Inference J. Farley A. Gerstlauer FedML 28 5 0 14 Jul 2021
Smoothed Differential Privacy Ao Liu Yu-Xiang Wang Lirong Xia 33 0 0 04 Jul 2021
Knowledge distillation: A good teacher is patient and consistent Lucas Beyer Xiaohua Zhai Amelie Royer L. Markeeva Rohan Anil Alexander Kolesnikov VLM 44 287 0 09 Jun 2021
Measuring what Really Matters: Optimizing Neural Networks for TinyML Lennart Heim Andreas Biri Zhongnan Qu Lothar Thiele 49 30 0 21 Apr 2021
Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators David Stutz Nandhini Chandramoorthy Matthias Hein Bernt Schiele AAML MQ 24 18 0 16 Apr 2021
Efficient Video Compression via Content-Adaptive Super-Resolution Mehrdad Khani Shirkoohi Vibhaalakshmi Sivaraman Mohammad Alizadeh SupR 34 49 0 06 Apr 2021
Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric Quantizer Phuoc Pham J. Abraham Jaeyong Chung MQ 37 11 0 01 Apr 2021
Toward Compact Deep Neural Networks via Energy-Aware Pruning Seul-Ki Yeom Kyung-Hwan Shim Jee-Hyun Hwang CVBM 28 12 0 19 Mar 2021