HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks

10 November 2019

Zhen Dong

Papers citing "HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks"

21 / 171 papers shown

Title
I-BERT: Integer-only BERT Quantization Sehoon Kim A. Gholami Z. Yao Michael W. Mahoney Kurt Keutzer MQ 107 344 0 05 Jan 2021
DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks Chee Hong Heewon Kim Sungyong Baik Junghun Oh Kyoung Mu Lee OOD SupR MQ 24 41 0 21 Dec 2020
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis Shachar Gluska Mark Grobman MQ 19 5 0 15 Dec 2020
A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints P. Mohan A. Paul Abhay Chirania CVBM 24 48 0 30 Nov 2020
HAWQV3: Dyadic Neural Network Quantization Z. Yao Zhen Dong Zhangcheng Zheng A. Gholami Jiali Yu ... Leyuan Wang Qijing Huang Yida Wang Michael W. Mahoney Kurt Keutzer MQ 22 87 0 20 Nov 2020
Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks Julieta Martinez Jashan Shewakramani Ting Liu Ioan Andrei Bârsan Wenyuan Zeng R. Urtasun MQ 23 30 0 29 Oct 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks Jianfei Chen Yujie Gai Z. Yao Michael W. Mahoney Joseph E. Gonzalez MQ 17 58 0 27 Oct 2020
Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks Xu Qian Victor Li Darren Crews MQ 24 9 0 19 Aug 2020
Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers Manuele Rusci Marco Fariselli Alessandro Capotondi Luca Benini MQ 24 17 0 12 Aug 2020
Differentiable Joint Pruning and Quantization for Hardware Efficiency Ying Wang Yadong Lu Tijmen Blankevoort MQ 30 72 0 20 Jul 2020
HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs H. Habi Roy H. Jennings Arnon Netzer MQ 29 65 0 20 Jul 2020
CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs Zhen Dong Dequan Wang Qijing Huang Yizhao Gao Yaohui Cai Tian Li Bichen Wu Kurt Keutzer J. Wawrzynek ObjD 31 1 0 12 Jun 2020
Generative Design of Hardware-aware DNNs Sheng-Chun Kao Arun Ramamurthy T. Krishna MQ 19 2 0 06 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning Z. Yao A. Gholami Sheng Shen Mustafa Mustafa Kurt Keutzer Michael W. Mahoney ODL 39 275 0 01 Jun 2020
Bayesian Bits: Unifying Quantization and Pruning M. V. Baalen Christos Louizos Markus Nagel Rana Ali Amjad Ying Wang Tijmen Blankevoort Max Welling MQ 18 114 0 14 May 2020
Neural Network Compression Framework for fast model inference Alexander Kozlov Ivan Lazarevich Vasily Shamporov N. Lyalyushkin Yury Gorbachev 36 35 0 20 Feb 2020
ZeroQ: A Novel Zero Shot Quantization Framework Yaohui Cai Z. Yao Zhen Dong A. Gholami Michael W. Mahoney Kurt Keutzer MQ 38 389 0 01 Jan 2020
PyHessian: Neural Networks Through the Lens of the Hessian Z. Yao A. Gholami Kurt Keutzer Michael W. Mahoney ODL 24 290 0 16 Dec 2019
OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators Ritchie Zhao Jordan Dotzel Zhanqiu Hu Preslav Ivanov Christopher De Sa Zhiru Zhang MQ 24 1 0 13 Oct 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT Sheng Shen Zhen Dong Jiayu Ye Linjian Ma Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 236 576 0 12 Sep 2019
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights Aojun Zhou Anbang Yao Yiwen Guo Lin Xu Yurong Chen MQ 337 1,049 0 10 Feb 2017