2205.07877
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
14 May 2022
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
Papers citing
"A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification"
36 / 36 papers shown
Title
Automatic mixed precision for optimizing gained time with constrained loss mean-squared-error based on model partition to sequential sub-graphs
Shmulik Markovich-Golan
Daniel Ohayon
Itay Niv
Yair Hanani
MQ
66
0
0
19 May 2025
Hardware-Aware DNN Compression for Homogeneous Edge Devices
Kunlong Zhang
Guiying Li
Ning Lu
Peng Yang
K. Tang
82
0
0
28 Jan 2025
Quantized symbolic time series approximation
Erin Carson
Xinye Chen
Cheng Kang
AI4TS
112
0
0
20 Nov 2024
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Jun Chen
Shipeng Bai
Tianxin Huang
Mengmeng Wang
Guanzhong Tian
Y. Liu
MQ
47
18
0
02 Jul 2023
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
Yuhang Li
Ruihao Gong
Xu Tan
Yang Yang
Peng Hu
Qi Zhang
F. Yu
Wei Wang
Shi Gu
MQ
95
426
0
10 Feb 2021
Differentiable Joint Pruning and Quantization for Hardware Efficiency
Ying Wang
Yadong Lu
Tijmen Blankevoort
MQ
38
72
0
20 Jul 2020
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
Tianzhe Wang
Kuan Wang
Han Cai
Ji Lin
Zhijian Liu
Song Han
MQ
46
174
0
15 Jun 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
34
563
0
22 Apr 2020
Binary Neural Networks: A Survey
Haotong Qin
Ruihao Gong
Xianglong Liu
Xiao Bai
Jingkuan Song
N. Sebe
MQ
83
463
0
31 Mar 2020
Towards Efficient Training for Neural Network Quantization
Qing Jin
Linjie Yang
Zhenyu A. Liao
MQ
55
42
0
21 Dec 2019
Adaptive Loss-aware Quantization for Multi-bit Networks
Zhongnan Qu
Zimu Zhou
Yun Cheng
Lothar Thiele
MQ
111
55
0
18 Dec 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
65
277
0
10 Nov 2019
Straight-Through Estimator as Projected Wasserstein Gradient Flow
Pengyu Cheng
YooJung Choi
Yitao Liang
Dinghan Shen
Ricardo Henao
Guy Van den Broeck
50
14
0
05 Oct 2019
Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks
Yuhang Li
Xin Dong
Wei Wang
MQ
50
255
0
28 Sep 2019
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
50
792
0
21 Feb 2019
Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
Bichen Wu
Yanghan Wang
Peizhao Zhang
Yuandong Tian
Peter Vajda
Kurt Keutzer
MQ
56
272
0
30 Nov 2018
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
95
876
0
21 Nov 2018
Bi-Real Net: Binarizing Deep Network Towards Real-Network Performance
Zechun Liu
Wenhan Luo
Baoyuan Wu
Xin Yang
Wen Liu
K. Cheng
MQ
37
92
0
04 Nov 2018
Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks
Penghang Yin
Shuai Zhang
J. Lyu
Stanley Osher
Y. Qi
Jack Xin
MQ
67
62
0
15 Aug 2018
PACT: Parameterized Clipping Activation for Quantized Neural Networks
Jungwook Choi
Zhuo Wang
Swagath Venkataramani
P. Chuang
Vijayalakshmi Srinivasan
K. Gopalakrishnan
MQ
40
945
0
16 May 2018
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
Asit K. Mishra
Debbie Marr
FedML
57
330
0
15 Nov 2017
To prune, or not to prune: exploring the efficacy of pruning for model compression
Michael Zhu
Suyog Gupta
118
1,262
0
05 Oct 2017
WRPN: Wide Reduced-Precision Networks
Asit K. Mishra
Eriko Nurvitadhi
Jeffrey J. Cook
Debbie Marr
MQ
52
267
0
04 Sep 2017
Channel Pruning for Accelerating Very Deep Neural Networks
Yihui He
Xiangyu Zhang
Jian Sun
189
2,513
0
19 Jul 2017
Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks
Shuchang Zhou
Yuzhi Wang
He Wen
Qinyao He
Yuheng Zou
MQ
70
110
0
22 Jun 2017
Efficient Processing of Deep Neural Networks: A Tutorial and Survey
Vivienne Sze
Yu-hsin Chen
Tien-Ju Yang
J. Emer
AAML
3DV
94
3,002
0
27 Mar 2017
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
155
3,676
0
31 Aug 2016
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
Mohammad Rastegari
Vicente Ordonez
Joseph Redmon
Ali Farhadi
MQ
129
4,342
0
16 Mar 2016
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
F. Iandola
Song Han
Matthew W. Moskewicz
Khalid Ashraf
W. Dally
Kurt Keutzer
110
7,448
0
24 Feb 2016
Neural Networks with Few Multiplications
Zhouhan Lin
Matthieu Courbariaux
Roland Memisevic
Yoshua Bengio
56
331
0
11 Oct 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
189
8,793
0
01 Oct 2015
Tensorizing Neural Networks
Alexander Novikov
D. Podoprikhin
A. Osokin
Dmitry Vetrov
71
879
0
22 Sep 2015
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
212
6,628
0
08 Jun 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
151
18,534
0
06 Feb 2015
Techniques for Learning Binary Stochastic Feedforward Neural Networks
T. Raiko
Mathias Berglund
Guillaume Alain
Laurent Dinh
BDL
80
126
0
11 Jun 2014
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Emily L. Denton
Wojciech Zaremba
Joan Bruna
Yann LeCun
Rob Fergus
FAtt
111
1,682
0
02 Apr 2014