ResearchTrend.AI
A White Paper on Neural Network Quantization

15 June 2021
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
    MQ

Papers citing "A White Paper on Neural Network Quantization"

50 / 264 papers shown
TransAxx: Efficient Transformers with Approximate Computing
Dimitrios Danopoulos
Georgios Zervakis
Dimitrios Soudris
Jörg Henkel
ViT
113
2
0
12 Feb 2024
FL-NAS: Towards Fairness of NAS for Resource Constrained Devices via Large Language Models
Ruiyang Qin
Yuting Hu
Zheyu Yan
Jinjun Xiong
Ahmed Abbasi
Yiyu Shi
66
7
0
09 Feb 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang
Jianyi Cheng
George A. Constantinides
Yiren Zhao
MQ
97
15
0
04 Feb 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
246
18
0
30 Jan 2024
HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference
Tianshi Xu
Meng Li
Runsheng Wang
81
1
0
29 Jan 2024
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
Sifan Zhou
Liang Li
Xinyu Zhang
Bo Zhang
Shipeng Bai
Miao Sun
Ziyu Zhao
Xiaobo Lu
Xiangxiang Chu
MQ
79
14
0
29 Jan 2024
CoSS: Co-optimizing Sensor and Sampling Rate for Data-Efficient AI in Human Activity Recognition
Mengxi Liu
Zimin Zhao
Daniel Geissler
Bo Zhou
Sungho Suh
P. Lukowicz
58
0
0
03 Jan 2024
Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching
Pengmiao Zhang
Neelesh Gupta
Rajgopal Kannan
Viktor K. Prasanna
69
3
0
23 Dec 2023
SCoTTi: Save Computation at Training Time with an adaptive framework
Ziyu Li
Enzo Tartaglione
Van-Tam Nguyen
90
0
0
19 Dec 2023
Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting
Dawei Yang
Ning He
Xing Hu
Zhihang Yuan
Jiangyong Yu
Chen Xu
Zhe Jiang
MQ
90
7
0
17 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
129
17
0
13 Dec 2023
MaxQ: Multi-Axis Query for N:M Sparsity Network
Jingyang Xiang
Siqi Li
Junhao Chen
Zhuangzhi Chen
Tianxin Huang
Linpeng Peng
Yong-Jin Liu
53
0
0
12 Dec 2023
Stateful Large Language Model Serving with Pensieve
Lingfan Yu
Jinyang Li
RALM, KELM, LLMAG
77
15
0
09 Dec 2023
MoEC: Mixture of Experts Implicit Neural Compression
Jianchen Zhao
Cheng-Ching Tseng
Ming Lu
Ruichuan An
Xiaobao Wei
He Sun
Shanghang Zhang
81
3
0
03 Dec 2023
The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models
Srinath Namburi
Makesh Narsimhan Sreedhar
Srinath Srinivasan
Frederic Sala
MQ
63
11
0
01 Dec 2023
The Trifecta: Three simple techniques for training deeper Forward-Forward networks
Thomas Dooms
Ing Jyh Tsang
José Oramas
69
4
0
29 Nov 2023
Fast and Efficient 2-bit LLM Inference on GPU: 2/4/16-bit in a Weight Matrix with Asynchronous Dequantization
Jinhao Li
Jiaming Xu
Shiyao Li
Shan Huang
Jun Liu
Yaoxiu Lian
Guohao Dai
MQ
59
3
0
28 Nov 2023
Hybrid Synaptic Structure for Spiking Neural Network Realization
S. Razmkhah
M. A. Karamuftuoglu
A. Bozbey
46
5
0
13 Nov 2023
Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing
Siao Tang
Xin Wang
Hong Chen
Chaoyu Guan
Zewen Wu
Yansong Tang
Wenwu Zhu
MQ
97
16
0
10 Nov 2023
Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks
Kartik Gupta
Akshay Asthana
MQ
36
8
0
09 Nov 2023
Fully Quantized Always-on Face Detector Considering Mobile Image Sensors
Haechang Lee
Wongi Jeong
Dongil Ryu
Hyunwoo Je
Albert No
Kijeong Kim
Se Young Chun
CVBM
59
0
0
02 Nov 2023
Exploring Post-Training Quantization of Protein Language Models
Shuang Peng
Fei Yang
Ning Sun
Sheng Chen
Yanfeng Jiang
Aimin Pan
MQ
52
0
0
30 Oct 2023
QWID: Quantized Weed Identification Deep neural network
Parikshit Singh Rathore
MQ
46
0
0
29 Oct 2023
MOSEL: Inference Serving Using Dynamic Modality Selection
Bodun Hu
Le Xu
Jeongyoon Moon
N. Yadwadkar
Aditya Akella
60
4
0
27 Oct 2023
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
Elias Frantar
Dan Alistarh
MQ, MoE
84
29
0
25 Oct 2023
Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients
Maximilian Krahn
Michele Sasdelli
Fengyi Yang
Vladislav Golyanik
Arno Solin
Tat-Jun Chin
Tolga Birdal
MQ
170
2
0
23 Oct 2023
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion
Filip Szatkowski
Eric Elmoznino
Younesse Kaddar
Simone Scardapane
MoE
60
6
0
06 Oct 2023
A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs
Tianheng Ling
Chao Qian
Lukas Einhaus
Gregor Schiele
31
1
0
04 Oct 2023
MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device
T. V. Rozendaal
Tushar Singhal
Hoang Le
Guillaume Sautière
Amir Said
...
Hitarth Mehta
Frank Mayer
Liang Zhang
Markus Nagel
Auke Wiggers
102
11
0
02 Oct 2023
On Calibration of Modern Quantized Efficient Neural Networks
Joe-Hwa Kuang
Alexander Wong
UQCV, MQ
137
1
0
25 Sep 2023
DeepliteRT: Computer Vision at the Edge
Saad Ashfaq
Alexander Hoffman
Saptarshi Mitra
Sudhakar Sah
Mohammadhossein Askarihemmat
Ehsan Saboori
VLM, MQ
105
1
0
19 Sep 2023
Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity
Matteo Grimaldi
Darshan C. Ganji
Ivan Lazarevich
Sudhakar Sah
61
10
0
12 Sep 2023
EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection
Andrej Jovanović
Mario Mihaly
Lennon Donaldson
70
0
0
11 Sep 2023
Softmax Bias Correction for Quantized Generative Models
N. Pandey
Marios Fournarakis
Chirag I. Patel
Markus Nagel
DiffM
68
11
0
04 Sep 2023
FPTQ: Fine-grained Post-Training Quantization for Large Language Models
Qingyuan Li
Yifan Zhang
Liang Li
Peng Yao
Bo Zhang
Xiangxiang Chu
Yerui Sun
Li-Qiang Du
Yuchen Xie
MQ
108
13
0
30 Aug 2023
ResQ: Residual Quantization for Video Perception
Davide Abati
H. Yahia
Markus Nagel
A. Habibian
MQ
38
2
0
18 Aug 2023
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
124
10
0
15 Aug 2023
Quantization Aware Factorization for Deep Neural Network Compression
Daria Cherniuk
Stanislav Abukhovich
Anh-Huy Phan
Ivan Oseledets
A. Cichocki
Julia Gusak
MQ
74
3
0
08 Aug 2023
Efficient neural supersampling on a novel gaming dataset
Antoine Mercier
Ruan Erasmus
Yash Savani
Manik Dhingra
Fatih Porikli
Guillaume Berger
SupR
66
2
0
03 Aug 2023
Tango: rethinking quantization for graph neural network training on GPUs
Shiyang Chen
Da Zheng
Caiwen Ding
Chengying Huan
Yuede Ji
Hang Liu
GNN, MQ
62
6
0
02 Aug 2023
MRQ:Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
52
0
0
01 Aug 2023
An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks
Ye Tao
Wanwei Liu
Fu Song
Zhen Liang
Jing Wang
Hongxu Zhu
55
1
0
29 Jul 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
62
11
0
20 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters
Marios Fournarakis
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
MQ
49
5
0
10 Jul 2023
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
133
55
0
06 Jul 2023
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
Weiming Zhuang
Chen Chen
Lingjuan Lyu
Chong Chen
Yaochu Jin
AIFin, AI4CE
223
98
0
27 Jun 2023
A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
Shichang Zhang
Atefeh Sohrabizadeh
Cheng Wan
Zijie Huang
Ziniu Hu
Yewen Wang
Yingyan Lin
Jason Cong
Yizhou Sun
GNN, AI4CE
98
25
0
24 Jun 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
123
93
0
22 Jun 2023
Resource Efficient Neural Networks Using Hessian Based Pruning
J. Chong
Manas Gupta
Lihui Chen
59
3
0
12 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL, MQ
173
587
0
01 Jun 2023