Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1706.07145
Cited By
Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks
22 June 2017
Shuchang Zhou
Yuzhi Wang
He Wen
Qinyao He
Yuheng Zou
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks"
22 / 22 papers shown
Title
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Armand Foucault
Franck Mamalet
François Malgouyres
MQ
74
0
0
28 Jan 2025
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Yunfan LU
Kurt Keutzer
Jianfei Chen
Song Han
MQ
75
9
0
25 Oct 2024
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
36
13
0
13 Dec 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong
Ye Lu
Surong Dai
Deng Qian
Chenkun Du
Tao Li
MQ
29
0
0
07 Apr 2023
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
21
2
0
11 Oct 2022
Limitations of neural network training due to numerical instability of backpropagation
Clemens Karner
V. Kazeev
P. Petersen
32
3
0
03 Oct 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
20
6
0
22 Mar 2022
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
150
675
0
24 Jan 2021
Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks
Yoonho Boo
Sungho Shin
Jungwook Choi
Wonyong Sung
MQ
30
29
0
30 Sep 2020
Exploring the Connection Between Binary and Spiking Neural Networks
Sen Lu
Abhronil Sengupta
MQ
14
100
0
24 Feb 2020
Towards Efficient Training for Neural Network Quantization
Qing Jin
Linjie Yang
Zhenyu A. Liao
MQ
16
42
0
21 Dec 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Ruihao Gong
Xianglong Liu
Shenghu Jiang
Tian-Hao Li
Peng Hu
Jiazhen Lin
F. Yu
Junjie Yan
MQ
32
446
0
14 Aug 2019
GDRQ: Group-based Distribution Reshaping for Quantization
Haibao Yu
Tuopu Wen
Guangliang Cheng
Jiankai Sun
Qi Han
Jianping Shi
MQ
33
3
0
05 Aug 2019
Recurrent Neural Networks: An Embedded Computing Perspective
Nesma M. Rezk
M. Purnaprajna
Tomas Nordstrom
Z. Ul-Abdin
35
81
0
23 Jul 2019
Constructing Energy-efficient Mixed-precision Neural Networks through Principal Component Analysis for Edge Intelligence
I. Chakraborty
Deboleena Roy
Isha Garg
Aayush Ankit
Kaushik Roy
21
37
0
04 Jun 2019
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Jungwook Choi
P. Chuang
Zhuo Wang
Swagath Venkataramani
Vijayalakshmi Srinivasan
K. Gopalakrishnan
MQ
13
75
0
17 Jul 2018
FINN-L: Library Extensions and Design Trade-off Analysis for Variable Precision LSTM Networks on FPGAs
Vladimir Rybalkin
Alessandro Pappalardo
M. M. Ghaffar
Giulio Gambardella
Norbert Wehn
Michaela Blott
11
72
0
11 Jul 2018
Retraining-Based Iterative Weight Quantization for Deep Neural Networks
Dongsoo Lee
Byeongwook Kim
MQ
33
16
0
29 May 2018
Accelerating CNN inference on FPGAs: A Survey
K. Abdelouahab
Maxime Pelcat
Jocelyn Serot
F. Berry
AI4CE
27
147
0
26 May 2018
PACT: Parameterized Clipping Activation for Quantized Neural Networks
Jungwook Choi
Zhuo Wang
Swagath Venkataramani
P. Chuang
Vijayalakshmi Srinivasan
K. Gopalakrishnan
MQ
13
936
0
16 May 2018
UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks
Chaim Baskin
Eli Schwartz
Evgenii Zheltonozhskii
Natan Liss
Raja Giryes
A. Bronstein
A. Mendelson
MQ
19
45
0
29 Apr 2018
Value-aware Quantization for Training and Inference of Neural Networks
Eunhyeok Park
S. Yoo
Peter Vajda
MQ
14
158
0
20 Apr 2018
1