2106.08295
A White Paper on Neural Network Quantization
15 June 2021
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
MQ
Papers citing
"A White Paper on Neural Network Quantization"
50 / 247 papers shown
Title
Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks
Kartik Gupta
Akshay Asthana
MQ
24
8
0
09 Nov 2023
Fully Quantized Always-on Face Detector Considering Mobile Image Sensors
Haechang Lee
Wongi Jeong
Dongil Ryu
Hyunwoo Je
Albert No
Kijeong Kim
Se Young Chun
CVBM
31
0
0
02 Nov 2023
Exploring Post-Training Quantization of Protein Language Models
Shuang Peng
Fei Yang
Ning Sun
Sheng Chen
Yanfeng Jiang
Aimin Pan
MQ
27
0
0
30 Oct 2023
QWID: Quantized Weed Identification Deep neural network
Parikshit Singh Rathore
MQ
11
0
0
29 Oct 2023
MOSEL: Inference Serving Using Dynamic Modality Selection
Bodun Hu
Le Xu
Jeongyoon Moon
N. Yadwadkar
Aditya Akella
13
4
0
27 Oct 2023
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
Elias Frantar
Dan Alistarh
MQ
MoE
29
24
0
25 Oct 2023
Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients
Maximilian Krahn
Michele Sasdelli
Fengyi Yang
Vladislav Golyanik
Arno Solin
Tat-Jun Chin
Tolga Birdal
MQ
87
2
0
23 Oct 2023
A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs
Tianheng Ling
Chao Qian
Lukas Einhaus
Gregor Schiele
11
1
0
04 Oct 2023
MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device
T. V. Rozendaal
Tushar Singhal
Hoang Le
Guillaume Sautière
Amir Said
...
Hitarth Mehta
Frank Mayer
Liang Zhang
Markus Nagel
Auke Wiggers
49
11
0
02 Oct 2023
On Calibration of Modern Quantized Efficient Neural Networks
Joe-Hwa Kuang
Alexander Wong
UQCV
MQ
27
1
0
25 Sep 2023
DeepliteRT: Computer Vision at the Edge
Saad Ashfaq
Alexander Hoffman
Saptarshi Mitra
Sudhakar Sah
Mohammadhossein Askarihemmat
Ehsan Saboori
VLM
MQ
29
0
0
19 Sep 2023
Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity
Matteo Grimaldi
Darshan C. Ganji
Ivan Lazarevich
Sudhakar Sah
14
10
0
12 Sep 2023
EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection
Andrej Jovanović
Mario Mihaly
Lennon Donaldson
36
0
0
11 Sep 2023
Softmax Bias Correction for Quantized Generative Models
N. Pandey
Marios Fournarakis
Chirag I. Patel
Markus Nagel
DiffM
17
11
0
04 Sep 2023
FPTQ: Fine-grained Post-Training Quantization for Large Language Models
Qingyuan Li
Yifan Zhang
Liang Li
Peng Yao
Bo-Wen Zhang
Xiangxiang Chu
Yerui Sun
Li-Qiang Du
Yuchen Xie
MQ
42
12
0
30 Aug 2023
ResQ: Residual Quantization for Video Perception
Davide Abati
H. Yahia
Markus Nagel
A. Habibian
MQ
21
2
0
18 Aug 2023
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
43
7
0
15 Aug 2023
Quantization Aware Factorization for Deep Neural Network Compression
Daria Cherniuk
Stanislav Abukhovich
Anh-Huy Phan
Ivan Oseledets
A. Cichocki
Julia Gusak
MQ
23
2
0
08 Aug 2023
Efficient neural supersampling on a novel gaming dataset
Antoine Mercier
Ruan Erasmus
Yash Savani
Manik Dhingra
Fatih Porikli
Guillaume Berger
SupR
34
1
0
03 Aug 2023
Tango: rethinking quantization for graph neural network training on GPUs
Shiyang Chen
Da Zheng
Caiwen Ding
Chengying Huan
Yuede Ji
Hang Liu
GNN
MQ
31
5
0
02 Aug 2023
MRQ: Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
22
0
0
01 Aug 2023
An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks
Ye Tao
Wanwei Liu
Fu Song
Zhen Liang
J. Wang
Hongxu Zhu
28
1
0
29 Jul 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
29
9
0
20 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters
Marios Fournarakis
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
MQ
27
5
0
10 Jul 2023
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
27
48
0
06 Jul 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
Chong Chen
Yaochu Jin
AIFin
AI4CE
99
85
0
27 Jun 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Yizhou Sun
GNN
AI4CE
34
23
0
24 Jun 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
21
87
0
22 Jun 2023
Resource Efficient Neural Networks Using Hessian Based Pruning
J. Chong
Manas Gupta
Lihui Chen
22
2
0
12 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
36
470
0
01 Jun 2023
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
J. H. Lee
Jeonghoon Kim
S. Kwon
Dongsoo Lee
MQ
28
33
0
01 Jun 2023
Intriguing Properties of Quantization at Scale
Arash Ahmadian
Saurabh Dash
Hongyu Chen
Bharat Venkitesh
Stephen Gou
Phil Blunsom
Ahmet Üstün
Sara Hooker
MQ
48
38
0
30 May 2023
Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy
Yang Li
Wei Wang
Ming Wang
C. Dou
Zhengyu Ma
...
Guanhua Yang
Feng Zhang
Ling Li
Daniele Ielmini
Ming-Yu Liu
18
5
0
25 Apr 2023
Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric
Lin Niu
Jia-Wen Liu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
33
2
0
19 Apr 2023
Arrhythmia Classifier Based on Ultra-Lightweight Binary Neural Network
Ninghao Pu
Zhong-Li Wu
Ao Wang
Hanshi Sun
Zijing Liu
Hao Liu
MQ
19
4
0
04 Apr 2023
FP8 versus INT8 for efficient deep learning inference
M. V. Baalen
Andrey Kuzmin
Suparna S. Nair
Yuwei Ren
E. Mahurin
...
Sundar Subramanian
Sanghyuk Lee
Markus Nagel
Joseph B. Soriaga
Tijmen Blankevoort
MQ
28
44
0
31 Mar 2023
Tetra-AML: Automatic Machine Learning via Tensor Networks
A. Naumov
Ar. Melnikov
V. Abronin
F. Oxanichenko
K. Izmailov
M. Pflitsch
A. Melnikov
M. Perelshtein
21
11
0
28 Mar 2023
Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance
Zhihang Yuan
Jiawei Liu
Jiaxiang Wu
Dawei Yang
Qiang Wu
Guangyu Sun
Wenyu Liu
Xinggang Wang
Bingzhe Wu
MQ
22
6
0
23 Mar 2023
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training
Xinwei Ou
Zhangxin Chen
Ce Zhu
Yipeng Liu
36
4
0
22 Mar 2023
Unit Scaling: Out-of-the-Box Low-Precision Training
Charlie Blake
Douglas Orr
Carlo Luschi
MQ
24
7
0
20 Mar 2023
MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video
Renat Bashirov
A. Larionov
E. Ustinova
Mikhail Sidorenko
D. Svitov
Ilya Zakharkin
Victor Lempitsky
3DH
29
3
0
17 Mar 2023
Operating critical machine learning models in resource constrained regimes
Raghavendra Selvan
Julian Schon
Erik Dam
MedIm
33
8
0
17 Mar 2023
QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms
Guillaume Berger
Manik Dhingra
Antoine Mercier
Yash Savani
Sunny Panchal
Fatih Porikli
SupR
20
5
0
08 Mar 2023
RQAT-INR: Improved Implicit Neural Image Compression
B. Damodaran
M. Balcilar
Franck Galpin
Pierre Hellier
27
8
0
06 Mar 2023
Hierarchical Training of Deep Neural Networks Using Early Exiting
Yamin Sepehri
P. Pad
A. C. Yüzügüler
P. Frossard
L. A. Dunbar
28
7
0
04 Mar 2023
Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators
Malte J. Rasch
C. Mackin
Manuel Le Gallo
An Chen
A. Fasoli
...
P. Narayanan
H. Tsai
G. Burr
Abu Sebastian
Vijay Narayanan
13
83
0
16 Feb 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
22
2
0
15 Feb 2023
A Practical Mixed Precision Algorithm for Post-Training Quantization
N. Pandey
Markus Nagel
M. V. Baalen
Yin-Ruey Huang
Chirag I. Patel
Tijmen Blankevoort
MQ
16
19
0
10 Feb 2023
Q-Diffusion: Quantizing Diffusion Models
Xiuyu Li
Yijia Liu
Long Lian
Hua Yang
Zhen Dong
Daniel Kang
Shanghang Zhang
Kurt Keutzer
DiffM
MQ
38
154
0
08 Feb 2023
Training with Mixed-Precision Floating-Point Assignments
Wonyeol Lee
Rahul Sharma
A. Aiken
MQ
26
2
0
31 Jan 2023