1811.08886
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"
50 / 436 papers shown
Title
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
111
9
0
25 Aug 2023
HyperSNN: A new efficient and robust deep learning model for resource constrained control applications
Zhanglu Yan
Shida Wang
Kaiwen Tang
Weng-Fai Wong
43
2
0
16 Aug 2023
Gradient-Based Post-Training Quantization: Challenging the Status Quo
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
MQ
89
0
0
15 Aug 2023
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
124
10
0
15 Aug 2023
Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation
Seyedarmin Azizi
M. Nazemi
A. Fayyazi
Massoud Pedram
MQ
57
5
0
12 Aug 2023
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
Xavier Fischer
AAML
85
2
0
09 Aug 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng Li
MQ
95
3
0
07 Aug 2023
Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu
Qiang Zhou
Jing Wang
Fan Wang
Jun Wang
Wei Zhang
ViT
40
5
0
03 Aug 2023
To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
Marc Botet Colomer
Pier Luigi Dovesi
Theodoros Panagiotakopoulos
J. Carvalho
Linus Harenstam-Nielsen
Hossein Azizpour
Hedvig Kjellström
Daniel Cremers
Matteo Poggi
TTA
93
9
0
27 Jul 2023
Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Chee Hong
Kyoung Mu Lee
SupR
MQ
50
1
0
25 Jul 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
Peijie Dong
Lujun Li
Zimian Wei
Xin-Yi Niu
Zhiliang Tian
H. Pan
MQ
79
31
0
20 Jul 2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon
Muhammad Abdullah Hanif
Giorgos Armeniakos
Xun Jiao
Mohamed Bennai
K. Pekmestzi
Dimitrios Soudris
104
3
0
20 Jul 2023
PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization
Daniele Jahier Pagliari
Matteo Risso
Beatrice Alessandra Motetti
Luca Bompani
71
9
0
18 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
125
73
0
16 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters
Marios Fournarakis
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
MQ
49
5
0
10 Jul 2023
DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
Bahareh Khabbazan
Marc Riera
Antonio González
MQ
69
3
0
28 Jun 2023
Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference
Matteo Risso
Luca Bompani
G. M. Sarda
Luca Benini
Enrico Macii
Massimo Poncino
Marian Verhelst
Daniele Jahier Pagliari
69
6
0
08 Jun 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
89
2
0
08 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
193
587
0
01 Jun 2023
DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning
E. Rahimian
Golara Javadi
Frederick Tung
Gabriel L. Oliveira
MoE
76
3
0
26 May 2023
MixFormerV2: Efficient Fully Transformer Tracking
Yutao Cui
Tian-Shu Song
Gangshan Wu
Limin Wang
90
59
0
25 May 2023
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
Ahmed F. AbouElhamayed
Angela Cui
Javier Fernandez-Marques
Nicholas D. Lane
Mohamed S. Abdelfattah
MQ
78
6
0
25 May 2023
PDP: Parameter-free Differentiable Pruning is All You Need
Minsik Cho
Saurabh N. Adya
Devang Naik
VLM
67
12
0
18 May 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
89
12
0
11 May 2023
LayerNAS: Neural Architecture Search in Polynomial Complexity
Yicheng Fan
Dana Alon
Jingyue Shen
Daiyi Peng
Keshav Kumar
Yun Long
Xin Wang
Fotis Iliopoulos
Da-Cheng Juan
Erik Vee
72
2
0
23 Apr 2023
QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Zhepeng Wang
Jinyang Li
Zhirui Hu
Blake Gage
Elizabeth Iwasawa
Weiwen Jiang
97
11
0
23 Apr 2023
Evil from Within: Machine Learning Backdoors through Hardware Trojans
Alexander Warnecke
Julian Speith
Janka Möller
Konrad Rieck
C. Paar
AAML
211
3
0
17 Apr 2023
Canvas: End-to-End Kernel Architecture Search in Neural Networks
Chenggang Zhao
Genghan Zhang
Mingyu Gao
56
1
0
16 Apr 2023
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos
Zhen Dong
Javier Mauricio Duarte
A. Gholami
Michael W. Mahoney
Jovan Mitrevski
Nhan Tran
MQ
57
3
0
13 Apr 2023
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
Ziwei Wang
Jiwen Lu
Han Xiao
Shengyu Liu
Jie Zhou
OffRL
61
1
0
13 Apr 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong
Ye Lu
Surong Dai
Deng Qian
Chenkun Du
Tao Li
MQ
57
0
0
07 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
211
48
0
30 Mar 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma
Huixia Li
Xiawu Zheng
Xuefeng Xiao
Rui Wang
Shilei Wen
Xin Pan
Chia-Wen Lin
Rongrong Ji
MQ
64
12
0
21 Mar 2023
Gated Compression Layers for Efficient Always-On Models
Haiguang Li
T. Thormundsson
I. Poupyrev
N. Gillian
76
2
0
15 Mar 2023
SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference
Li Zhang
Xudong Wang
Jiahang Xu
Quanlu Zhang
Yujing Wang
Yuqing Yang
Ningxin Zheng
Ting Cao
Mao Yang
MQ
55
3
0
15 Mar 2023
R2 Loss: Range Restriction Loss for Model Compression and Quantization
Arnav Kundu
Chungkuk Yoo
Srijan Mishra
Minsik Cho
Saurabh N. Adya
MQ
65
1
0
14 Mar 2023
MetaMixer: A Regularization Strategy for Online Knowledge Distillation
Maorong Wang
L. Xiao
T. Yamasaki
KELM
MoE
41
1
0
14 Mar 2023
AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
Hao Wen
Yuanchun Li
Zunshuai Zhang
Shiqi Jiang
Xiaozhou Ye
Ouyang Ye
Yaqin Zhang
Yunxin Liu
140
33
0
13 Mar 2023
Bag of Tricks with Quantized Convolutional Neural Networks for image classification
Jie Hu
Mengze Zeng
Enhua Wu
MQ
57
2
0
13 Mar 2023
TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT
Yuting Sun
Tong Chen
Quoc Viet Hung Nguyen
Hongzhi Yin
92
13
0
07 Mar 2023
Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas
Yury Nahshan
Hanoch Kremer
Gil Lederman
Shira Zaloshinski
Zheng Li
E. Haleva
MQ
119
1
0
03 Mar 2023
Structured Pruning for Deep Convolutional Neural Networks: A survey
Yang He
Lingao Xiao
3DPC
116
143
0
01 Mar 2023
DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
Jiajun Zhou
Jiajun Wu
Yizhao Gao
Yuhao Ding
Chaofan Tao
Yue Liu
Fengbin Tu
Kwang-Ting Cheng
Hayden Kwok-Hay So
Ngai Wong
MQ
71
7
0
24 Feb 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
81
3
0
15 Feb 2023
SEAM: Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization
Chen Tang
Kai Ouyang
Zenghao Chai
Yunpeng Bai
Yuan Meng
Zhi Wang
Wenwu Zhu
MQ
79
9
0
14 Feb 2023
A Practical Mixed Precision Algorithm for Post-Training Quantization
N. Pandey
Markus Nagel
M. V. Baalen
Yin-Ruey Huang
Chirag I. Patel
Tijmen Blankevoort
MQ
64
22
0
10 Feb 2023
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
Yingchun Wang
Jingcai Guo
Song Guo
Weizhan Zhang
MQ
75
21
0
09 Feb 2023
DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System
Minkyoung Cho
Kang G. Shin
38
2
0
03 Feb 2023
Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer
Elfie Guo
Caitlin Stanton
Xiaofan Zhang
T. Jablin
Navid Lambert-Shirzad
Jian Li
Chia-Wei Chou
Siddharth Joshi
Yu Wang
MQ
92
3
0
02 Feb 2023