1811.08886
Cited By
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
Papers citing
"HAQ: Hardware-Aware Automated Quantization with Mixed Precision"
50 / 435 papers shown
Title
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
43
7
0
15 Aug 2023
Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation
Seyedarmin Azizi
M. Nazemi
A. Fayyazi
Massoud Pedram
MQ
27
5
0
12 Aug 2023
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
Xavier Fischer
AAML
10
2
0
09 Aug 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng Li
MQ
33
3
0
07 Aug 2023
Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu
Qiang Zhou
Jing Wang
Fan Wang
Jun Wang
Wei Zhang
ViT
28
5
0
03 Aug 2023
To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
Marc Botet Colomer
Pier Luigi Dovesi
Theodoros Panagiotakopoulos
J. Carvalho
Linus Harenstam-Nielsen
Hossein Azizpour
Hedvig Kjellström
Daniel Cremers
Matteo Poggi
TTA
30
9
0
27 Jul 2023
Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Chee Hong
Kyoung Mu Lee
SupR
MQ
27
1
0
25 Jul 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
Peijie Dong
Lujun Li
Zimian Wei
Xin-Yi Niu
Zhiliang Tian
H. Pan
MQ
51
28
0
20 Jul 2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon
Muhammad Abdullah Hanif
Giorgos Armeniakos
Xun Jiao
Muhammad Shafique
K. Pekmestzi
Dimitrios Soudris
37
3
0
20 Jul 2023
PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization
Daniele Jahier Pagliari
Matteo Risso
Beatrice Alessandra Motetti
Alessio Burrello
21
8
0
18 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
45
62
0
16 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters
Marios Fournarakis
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
MQ
27
5
0
10 Jul 2023
DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
Bahareh Khabbazan
Marc Riera
Antonio González
MQ
16
3
0
28 Jun 2023
Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference
Matteo Risso
Alessio Burrello
G. M. Sarda
Luca Benini
Enrico Macii
M. Poncino
Marian Verhelst
Daniele Jahier Pagliari
28
4
0
08 Jun 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
30
2
0
08 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
36
470
0
01 Jun 2023
DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning
E. Rahimian
Golara Javadi
Frederick Tung
Gabriel L. Oliveira
MoE
24
2
0
26 May 2023
MixFormerV2: Efficient Fully Transformer Tracking
Yutao Cui
Tian-Shu Song
Gangshan Wu
Liming Wang
29
54
0
25 May 2023
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
Ahmed F. AbouElhamayed
Angela Cui
Javier Fernandez-Marques
Nicholas D. Lane
Mohamed S. Abdelfattah
MQ
26
4
0
25 May 2023
PDP: Parameter-free Differentiable Pruning is All You Need
Minsik Cho
Saurabh N. Adya
Devang Naik
VLM
17
10
0
18 May 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
32
12
0
11 May 2023
LayerNAS: Neural Architecture Search in Polynomial Complexity
Yicheng Fan
Dana Alon
Jingyue Shen
Daiyi Peng
Keshav Kumar
Yun Long
Xin Wang
Fotis Iliopoulos
Da-Cheng Juan
Erik Vee
31
2
0
23 Apr 2023
QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Zhepeng Wang
Jinyang Li
Zhirui Hu
Blake Gage
Elizabeth Iwasawa
Weiwen Jiang
33
9
0
23 Apr 2023
Evil from Within: Machine Learning Backdoors through Hardware Trojans
Alexander Warnecke
Julian Speith
Janka Möller
Konrad Rieck
C. Paar
AAML
16
3
0
17 Apr 2023
Canvas: End-to-End Kernel Architecture Search in Neural Networks
Chenggang Zhao
Genghan Zhang
Mingyu Gao
20
1
0
16 Apr 2023
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos
Zhen Dong
Javier Mauricio Duarte
A. Gholami
Michael W. Mahoney
Jovan Mitrevski
Nhan Tran
MQ
32
3
0
13 Apr 2023
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
Ziwei Wang
Jiwen Lu
Han Xiao
Shengyu Liu
Jie Zhou
OffRL
29
1
0
13 Apr 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong
Ye Lu
Surong Dai
Deng Qian
Chenkun Du
Tao Li
MQ
29
0
0
07 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
29
46
0
30 Mar 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma
Huixia Li
Xiawu Zheng
Xuefeng Xiao
Rui Wang
Shilei Wen
Xin Pan
Rongrong Ji
MQ
21
12
0
21 Mar 2023
Gated Compression Layers for Efficient Always-On Models
Haiguang Li
T. Thormundsson
I. Poupyrev
N. Gillian
44
2
0
15 Mar 2023
SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference
Li Zhang
Xudong Wang
Jiahang Xu
Quanlu Zhang
Yujing Wang
Yuqing Yang
Ningxin Zheng
Ting Cao
Mao Yang
MQ
38
2
0
15 Mar 2023
R2 Loss: Range Restriction Loss for Model Compression and Quantization
Arnav Kundu
Chungkuk Yoo
Srijan Mishra
Minsik Cho
Saurabh N. Adya
MQ
33
1
0
14 Mar 2023
MetaMixer: A Regularization Strategy for Online Knowledge Distillation
Maorong Wang
L. Xiao
T. Yamasaki
KELM
MoE
29
1
0
14 Mar 2023
AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
Hao Wen
Yuanchun Li
Zunshuai Zhang
Shiqi Jiang
Xiaozhou Ye
Ouyang Ye
Yaqin Zhang
Yunxin Liu
90
29
0
13 Mar 2023
Bag of Tricks with Quantized Convolutional Neural Networks for image classification
Jie Hu
Mengze Zeng
Enhua Wu
MQ
23
2
0
13 Mar 2023
TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT
Yuting Sun
Tong Chen
Quoc Viet Hung Nguyen
Hongzhi Yin
23
12
0
07 Mar 2023
Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas
Yury Nahshan
Hanoch Kremer
Gil Lederman
Shira Zaloshinski
Zheng Li
E. Haleva
MQ
23
0
0
03 Mar 2023
Structured Pruning for Deep Convolutional Neural Networks: A survey
Yang He
Lingao Xiao
3DPC
30
117
0
01 Mar 2023
DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
Jiajun Zhou
Jiajun Wu
Yizhao Gao
Yuhao Ding
Chaofan Tao
Bo-wen Li
Fengbin Tu
Kwang-Ting Cheng
Hayden Kwok-Hay So
Ngai Wong
MQ
29
7
0
24 Feb 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
22
2
0
15 Feb 2023
SEAM: Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization
Chen Tang
Kai Ouyang
Zenghao Chai
Yunpeng Bai
Yuan Meng
Zhi Wang
Wenwu Zhu
MQ
32
9
0
14 Feb 2023
A Practical Mixed Precision Algorithm for Post-Training Quantization
N. Pandey
Markus Nagel
M. V. Baalen
Yin-Ruey Huang
Chirag I. Patel
Tijmen Blankevoort
MQ
16
19
0
10 Feb 2023
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
Yingchun Wang
Jingcai Guo
Song Guo
Weizhan Zhang
MQ
31
20
0
09 Feb 2023
DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System
Minkyoung Cho
Kang G. Shin
29
2
0
03 Feb 2023
Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer
Elfie Guo
Caitlin Stanton
Xiaofan Zhang
T. Jablin
Navid Lambert-Shirzad
Jian Li
Chia-Wei Chou
Siddharth Joshi
Yu Wang
MQ
31
3
0
02 Feb 2023
A²Q: Aggregation-Aware Quantization for Graph Neural Networks
Zeyu Zhu
Fanrong Li
Zitao Mo
Qinghao Hu
Gang Li
Zejian Liu
Xiaoyao Liang
Jian Cheng
GNN
MQ
34
4
0
01 Feb 2023
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani
J. McKinstry
S. K. Esser
R. Appuswamy
D. Modha
MQ
23
4
0
30 Jan 2023
Does Federated Learning Really Need Backpropagation?
H. Feng
Tianyu Pang
Chao Du
Wei Chen
Shuicheng Yan
Min-Bin Lin
FedML
36
10
0
28 Jan 2023