A White Paper on Neural Network Quantization

15 June 2021
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
    MQ

Papers citing "A White Paper on Neural Network Quantization"

47 / 247 papers shown
Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
19
4
0
31 Jan 2023
BOMP-NAS: Bayesian Optimization Mixed Precision NAS
David van Son
F. D. Putter
Sebastian Vogel
Henk Corporaal
MQ
27
3
0
27 Jan 2023
Optimized learned entropy coding parameters for practical neural-based image and video compression
A. Said
Reza Pourreza
H. Le
MQ
35
2
0
20 Jan 2023
Person Detection Using an Ultra Low-resolution Thermal Imager on a Low-cost MCU
Maarten Vandersteegen
Wouter Reusen
Kristof Van Beeck
Toon Goedemé
23
2
0
16 Dec 2022
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu
Lin Niu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
96
68
0
14 Dec 2022
QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks
Yedi Zhang
Zhe Zhao
Fu Song
Mengdi Zhang
Tao Chen
Jun Sun
36
17
0
10 Dec 2022
QEBVerif: Quantization Error Bound Verification of Neural Networks
Yedi Zhang
Fu Song
Jun Sun
MQ
26
11
0
06 Dec 2022
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
26
6
0
05 Dec 2022
Device Interoperability for Learned Image Compression with Weights and Activations Quantization
Esin Koyuncu
T. Solovyev
Elena Alshina
Andre Kaup
21
10
0
02 Dec 2022
Post-training Quantization on Diffusion Models
Yuzhang Shang
Zhihang Yuan
Bin Xie
Bingzhe Wu
Yan Yan
DiffM
MQ
15
159
0
28 Nov 2022
AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
Louis Leconte
S. Schechtman
Eric Moulines
29
4
0
07 Nov 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
24
886
0
31 Oct 2022
Neural Networks with Quantization Constraints
Ignacio Hounie
Juan Elenter
Alejandro Ribeiro
MQ
21
4
0
27 Oct 2022
Desiderata for next generation of ML model serving
Sherif Akoush
Andrei Paleyes
A. V. Looveren
Clive Cox
33
5
0
26 Oct 2022
Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan
Kazi Nabiul Alam
Abdur Rab Dhruba
H. Zunair
Nabeel Mohammed
32
23
0
14 Oct 2022
Inference Latency Prediction at the Edge
Zhuojin Li
Marco Paolieri
L. Golubchik
27
3
0
06 Oct 2022
SAMP: A Model Inference Toolkit of Post-Training Quantization for Text Processing via Self-Adaptive Mixed-Precision
Rong Tian
Zijing Zhao
Weijie Liu
Haoyan Liu
Weiquan Mao
Zhe Zhao
Kimmo Yan
MQ
19
5
0
19 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
82
31
0
14 Sep 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
20
55
0
30 Aug 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
25
216
0
24 Aug 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
46
3
0
22 Aug 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
25
78
0
19 Aug 2022
Boosting neural video codecs by exploiting hierarchical redundancy
Reza Pourreza
H. Le
A. Said
Guillaume Sautière
Auke Wiggers
23
13
0
08 Aug 2022
Quantized Sparse Weight Decomposition for Neural Network Compression
Andrey Kuzmin
M. V. Baalen
Markus Nagel
Arash Behboodi
MQ
14
3
0
22 Jul 2022
Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime
Saad Ashfaq
Mohammadhossein Askarihemmat
Sudhakar Sah
Ehsan Saboori
Olivier Mastropietro
Alexander Hoffman
BDL
MQ
21
4
0
18 Jul 2022
MobileCodec: Neural Inter-frame Video Compression on Mobile Devices
H. Le
Liang Zhang
A. Said
Guillaume Sautière
Yang Yang
Pranav Shrestha
Fei Yin
Reza Pourreza
Auke Wiggers
28
30
0
18 Jul 2022
Quantization Robust Federated Learning for Efficient Inference on Heterogeneous Devices
Kartik Gupta
Marios Fournarakis
M. Reisser
Christos Louizos
Markus Nagel
FedML
19
14
0
22 Jun 2022
Wavelet Feature Maps Compression for Image-to-Image CNNs
Shahaf E. Finder
Yair Zohav
Maor Ashkenazi
Eran Treister
19
17
0
24 May 2022
RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization
Hongyi Yao
Pu Li
Jian Cao
Xiangcheng Liu
Chenying Xie
Bin Wang
MQ
23
12
0
26 Apr 2022
Vision Transformer Compression with Structured Pruning and Low Rank Approximation
Ankur Kumar
ViT
26
6
0
25 Mar 2022
Overcoming Oscillations in Quantization-Aware Training
Markus Nagel
Marios Fournarakis
Yelysei Bondarenko
Tijmen Blankevoort
MQ
111
101
0
21 Mar 2022
TinyMLOps: Operational Challenges for Widespread Edge AI Adoption
Sam Leroux
Pieter Simoens
Meelis Lootus
Kartik Thakore
Akshay Sharma
24
16
0
21 Mar 2022
An Empirical Study of Low Precision Quantization for TinyML
Shaojie Zhuo
Hongyu Chen
R. Ramakrishnan
Tommy Chen
Chen Feng
Yi-Rung Lin
Parker Zhang
Liang Shen
MQ
32
13
0
10 Mar 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He
Zi Yang
Yuan-Hsin Chen
Qi Zhang
Hongwei Qin
Yan Wang
MQ
42
13
0
15 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
24
48
0
10 Feb 2022
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Georgii Sergeevich Novikov
Daniel Bershatsky
Julia Gusak
Alex Shonenkov
Denis Dimitrov
Ivan V. Oseledets
MQ
26
17
0
01 Feb 2022
SPDY: Accurate Pruning with Speedup Guarantees
Elias Frantar
Dan Alistarh
39
33
0
31 Jan 2022
Implicit Neural Video Compression
Yunfan Zhang
T. V. Rozendaal
Johann Brehmer
Markus Nagel
Taco S. Cohen
49
57
0
21 Dec 2021
Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats
Brian Chmiel
Ron Banner
Elad Hoffer
Hilla Ben Yaacov
Daniel Soudry
MQ
30
22
0
19 Dec 2021
Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set
T. V. Rozendaal
Johann Brehmer
Yunfan Zhang
Reza Pourreza
Auke Wiggers
Taco S. Cohen
39
24
0
19 Nov 2021
An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks
Guoxuan Xia
Sangwon Ha
Tiago Azevedo
Partha P. Maji
UQCV
17
1
0
10 Nov 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
25
133
0
27 Sep 2021
HPTQ: Hardware-Friendly Post Training Quantization
H. Habi
Reuven Peretz
Elad Cohen
Lior Dikstein
Oranit Dror
I. Diamant
Roy H. Jennings
Arnon Netzer
MQ
31
8
0
19 Sep 2021
DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks
Chee Hong
Heewon Kim
Sungyong Baik
Junghun Oh
Kyoung Mu Lee
OOD
SupR
MQ
21
41
0
21 Dec 2020
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces
Bert Moons
Parham Noorzad
Andrii Skliar
G. Mariani
Dushyant Mehta
Chris Lott
Tijmen Blankevoort
145
43
0
16 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,567
0
17 Apr 2017