ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

arXiv:1911.03852 · Cited By
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks

10 November 2019
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
    MQ
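For context on the cited paper: HAWQ-V2 scores each layer's quantization sensitivity by its average Hessian trace weighted by the quantization perturbation, then allocates bit-widths accordingly. Below is a minimal sketch of that ordering using hypothetical per-layer numbers; the paper itself estimates traces with Hutchinson's randomized method and solves a Pareto-frontier bit allocation rather than this greedy rule.

```python
import numpy as np

# Hypothetical per-layer average Hessian traces and quantization
# perturbations (illustrative numbers only; HAWQ-V2 estimates the traces
# on the real network with Hutchinson's randomized trace estimator).
traces = np.array([120.0, 5.0, 40.0, 0.8])    # mean Hessian trace per layer
perturb = np.array([0.02, 0.05, 0.03, 0.04])  # ||Q(W_i) - W_i||^2 per layer

# HAWQ-V2 sensitivity metric: Omega_i = mean-trace(H_i) * ||Q(W_i) - W_i||^2
sensitivity = traces * perturb

# Greedy stand-in for the paper's Pareto-frontier bit allocation:
# hand the widest bit-widths to the most sensitive layers.
bit_menu = [8, 8, 4, 2]                 # available precisions, widest first
order = np.argsort(-sensitivity)        # layer indices, most sensitive first
bits = np.empty(len(traces), dtype=int)
for rank, layer in enumerate(order):
    bits[layer] = bit_menu[rank]

print(bits.tolist())  # most sensitive layers receive the most bits
```

The greedy assignment is only a stand-in for illustration: with a fixed bit "menu" it shows how the sensitivity ordering drives mixed precision, while the paper trades off total model size against summed sensitivity.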

Papers citing "HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks"

50 / 171 papers shown
FIT: A Metric for Model Sensitivity
Ben Zandonati
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
24
8
0
16 Oct 2022
Analysis of Quantization on MLP-based Vision Models
Lingran Zhao
Zhen Dong
Kurt Keutzer
MQ
32
7
0
14 Sep 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
20
58
0
30 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
25
11
0
11 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
23
2
0
31 Jul 2022
Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss
Daning Cheng
Wenguang Chen
MQ
29
0
0
20 Jul 2022
Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding
Connor Holmes
Minjia Zhang
Yuxiong He
Bo Wu
25
3
0
30 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
19
5
0
23 Jun 2022
Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
Clemens J. S. Schaefer
Siddharth Joshi
Shane Li
Raul Blazquez
MQ
33
9
0
15 Jun 2022
QONNX: Representing Arbitrary-Precision Quantized Neural Networks
Alessandro Pappalardo
Yaman Umuroglu
Michaela Blott
Jovan Mitrevski
B. Hawks
...
J. Muhizi
Matthew Trahms
Shih-Chieh Hsu
Scott Hauck
Javier Mauricio Duarte
MQ
24
18
0
15 Jun 2022
SDQ: Stochastic Differentiable Quantization with Mixed Precision
Xijie Huang
Zhiqiang Shen
Shichao Li
Zechun Liu
Xianghong Hu
Jeffry Wicaksana
Eric P. Xing
Kwang-Ting Cheng
MQ
27
33
0
09 Jun 2022
NIPQ: Noise proxy-based Integrated Pseudo-Quantization
Juncheol Shin
Junhyuk So
Sein Park
Seungyeop Kang
S. Yoo
Eunhyeok Park
25
27
0
02 Jun 2022
AMED: Automatic Mixed-Precision Quantization for Edge Devices
Moshe Kimhi
T. Rozen
A. Mendelson
Chaim Baskin
MQ
27
3
0
30 May 2022
Wavelet Feature Maps Compression for Image-to-Image CNNs
Shahaf E. Finder
Yair Zohav
Maor Ashkenazi
Eran Treister
22
17
0
24 May 2022
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
40
85
0
14 May 2022
UnrealNAS: Can We Search Neural Architectures with Unreal Data?
Zhen Dong
Kaichen Zhou
Ge Li
Qiang Zhou
Mingfei Guo
Guohao Li
Kurt Keutzer
Shanghang Zhang
41
0
0
04 May 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
19
168
0
11 Mar 2022
Structured Pruning is All You Need for Pruning CNNs at Initialization
Yaohui Cai
Weizhe Hua
Hongzheng Chen
G. E. Suh
Christopher De Sa
Zhiru Zhang
CVBM
49
14
0
04 Mar 2022
SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Cong Guo
Yuxian Qiu
Jingwen Leng
Xiaotian Gao
Chen Zhang
Yunxin Liu
Fan Yang
Yuhao Zhu
Minyi Guo
MQ
74
70
0
14 Feb 2022
Quantization in Layer's Input is Matter
Daning Cheng
Wenguang Chen
MQ
11
0
0
10 Feb 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari
Conglong Li
Z. Yao
Minjia Zhang
Reza Yazdani Aminabadi
A. A. Awan
Jeff Rasley
Yuxiong He
47
286
0
14 Jan 2022
Neural Network Quantization for Efficient Inference: A Survey
Olivia Weng
MQ
28
23
0
08 Dec 2021
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
30
13
0
29 Nov 2021
Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition
Junhao Xu
Jianwei Yu
Xunying Liu
Helen Meng
MQ
36
10
0
29 Nov 2021
Mixed Precision Quantization of Transformer Language Models for Speech Recognition
Junhao Xu
Shoukang Hu
Jianwei Yu
Xunying Liu
Helen M. Meng
MQ
40
15
0
29 Nov 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
27
24
0
24 Nov 2021
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu
Zipeng Feng
Shuangkang Fang
Song Yuan
Yi Yang
Shuchang Zhou
MQ
30
1
0
01 Nov 2021
RMSMP: A Novel Deep Neural Network Quantization Framework with Row-wise Mixed Schemes and Multiple Precisions
Sung-En Chang
Yanyu Li
Mengshu Sun
Weiwen Jiang
Sijia Liu
Yanzhi Wang
Xue Lin
MQ
25
10
0
30 Oct 2021
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
Sanghyun Hong
Michael-Andrei Panaitescu-Liess
Yigitcan Kaya
Tudor Dumitras
MQ
60
13
0
26 Oct 2021
Applications and Techniques for Fast Machine Learning in Science
A. Deiana
Nhan Tran
Joshua C. Agar
Michaela Blott
G. D. Guglielmo
...
Ashish Sharma
S. Summers
Pietro Vischia
J. Vlimant
Olivia Weng
14
71
0
25 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen
Peisong Wang
Jian Cheng
MQ
44
62
0
13 Oct 2021
OMPQ: Orthogonal Mixed Precision Quantization
Yuexiao Ma
Taisong Jin
Xiawu Zheng
Yan Wang
Huixia Li
Yongjian Wu
Guannan Jiang
Wei Zhang
Rongrong Ji
MQ
19
33
0
16 Sep 2021
DKM: Differentiable K-Means Clustering Layer for Neural Network Compression
Minsik Cho
Keivan Alizadeh Vahid
Saurabh N. Adya
Mohammad Rastegari
42
34
0
28 Aug 2021
Generalizable Mixed-Precision Quantization via Attribution Rank Preservation
Ziwei Wang
Han Xiao
Jiwen Lu
Jie Zhou
MQ
22
32
0
05 Aug 2021
Machine Learning Advances aiding Recognition and Classification of Indian Monuments and Landmarks
A. Paul
S. Ghose
K. Aggarwal
Niketha Nethaji
Shivam Pal
Arnab Dutta Purkayastha
23
9
0
29 Jul 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
56
327
0
27 Jun 2021
Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
Santiago Miret
Vui Seng Chua
Mattias Marder
Mariano Phielipp
Nilesh Jain
Somdeb Majumdar
23
8
0
14 Jun 2021
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen
Lianmin Zheng
Z. Yao
Dequan Wang
Ion Stoica
Michael W. Mahoney
Joseph E. Gonzalez
MQ
27
74
0
29 Apr 2021
Hessian Aware Quantization of Spiking Neural Networks
H. Lui
Emre Neftci
MQ
16
14
0
29 Apr 2021
HAO: Hardware-aware neural Architecture Optimization for Efficient Inference
Zhen Dong
Yizhao Gao
Qijing Huang
J. Wawrzynek
Hayden Kwok-Hay So
Kurt Keutzer
19
34
0
26 Apr 2021
Differentiable Model Compression via Pseudo Quantization Noise
Alexandre Défossez
Yossi Adi
Gabriel Synnaeve
DiffM
MQ
18
47
0
20 Apr 2021
TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT
H. F. Langroudi
Vedant Karia
Tej Pandit
Dhireesha Kudithipudi
MQ
24
10
0
06 Apr 2021
Network Quantization with Element-wise Gradient Scaling
Junghyup Lee
Dohyung Kim
Bumsub Ham
MQ
18
115
0
02 Apr 2021
Data-free mixed-precision quantization using novel sensitivity metric
Donghyun Lee
M. Cho
Seungwon Lee
Joonho Song
Changkyu Choi
MQ
19
2
0
18 Mar 2021
Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference
B. Hawks
Javier Mauricio Duarte
Nicholas J. Fraser
Alessandro Pappalardo
N. Tran
Yaman Umuroglu
MQ
8
51
0
22 Feb 2021
GradFreeBits: Gradient Free Bit Allocation for Dynamic Low Precision Neural Networks
Ben Bodner
G. B. Shalom
Eran Treister
MQ
24
2
0
18 Feb 2021
Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Urmish Thakker
P. Whatmough
Zhi-Gang Liu
Matthew Mattina
Jesse G. Beu
16
6
0
14 Feb 2021
Confounding Tradeoffs for Neural Network Quantization
Sahaj Garg
Anirudh Jain
Joe Lou
Mitchell Nahmias
MQ
26
17
0
12 Feb 2021
Dynamic Precision Analog Computing for Neural Networks
Sahaj Garg
Joe Lou
Anirudh Jain
Mitchell Nahmias
45
33
0
12 Feb 2021
Single-path Bit Sharing for Automatic Loss-aware Model Compression
Jing Liu
Bohan Zhuang
Peng Chen
Chunhua Shen
Jianfei Cai
Mingkui Tan
MQ
15
7
0
13 Jan 2021