ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

HAQ: Hardware-Aware Automated Quantization with Mixed Precision
arXiv:1811.08886 · v3 (latest) · 21 November 2018
Kuan-Chieh Wang
Zhijian Liu
Chengyue Wu
Ji Lin
Song Han
    MQ

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 436 papers shown
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
Ian Colbert
Alessandro Pappalardo
Jakoba Petri-Koenig
MQ
111
9
0
25 Aug 2023
HyperSNN: A new efficient and robust deep learning model for resource constrained control applications
Zhanglu Yan
Shida Wang
Kaiwen Tang
Weng-Fai Wong
43
2
0
16 Aug 2023
Gradient-Based Post-Training Quantization: Challenging the Status Quo
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
MQ
89
0
0
15 Aug 2023
EQ-Net: Elastic Quantization Neural Networks
Ke Xu
Lei Han
Ye Tian
Shangshang Yang
Xingyi Zhang
MQ
124
10
0
15 Aug 2023
Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation
Seyedarmin Azizi
M. Nazemi
A. Fayyazi
Massoud Pedram
MQ
57
5
0
12 Aug 2023
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
Xavier Fischer
AAML
85
2
0
09 Aug 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng Li
MQ
95
3
0
07 Aug 2023
Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu
Qiang Zhou
Jing Wang
Fan Wang
Jun Wang
Wei Zhang
ViT
40
5
0
03 Aug 2023
To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
Marc Botet Colomer
Pier Luigi Dovesi
Theodoros Panagiotakopoulos
J. Carvalho
Linus Harenstam-Nielsen
Hossein Azizpour
Hedvig Kjellström
Daniel Cremers
Matteo Poggi
TTA
93
9
0
27 Jul 2023
Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Chee Hong
Kyoung Mu Lee
SupR MQ
50
1
0
25 Jul 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
Peijie Dong
Lujun Li
Zimian Wei
Xin-Yi Niu
Zhiliang Tian
H. Pan
MQ
79
31
0
20 Jul 2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon
Muhammad Abdullah Hanif
Giorgos Armeniakos
Xun Jiao
Mohamed Bennai
K. Pekmestzi
Dimitrios Soudris
104
3
0
20 Jul 2023
PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization
Daniele Jahier Pagliari
Matteo Risso
Beatrice Alessandra Motetti
Luca Bompani
71
9
0
18 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
125
73
0
16 Jul 2023
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters
Marios Fournarakis
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
MQ
49
5
0
10 Jul 2023
DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
Bahareh Khabbazan
Marc Riera
Antonio González
MQ
69
3
0
28 Jun 2023
Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference
Matteo Risso
Luca Bompani
G. M. Sarda
Luca Benini
Enrico Macii
Massimo Poncino
Marian Verhelst
Daniele Jahier Pagliari
69
6
0
08 Jun 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
89
2
0
08 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL MQ
193
587
0
01 Jun 2023
DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning
E. Rahimian
Golara Javadi
Frederick Tung
Gabriel L. Oliveira
MoE
76
3
0
26 May 2023
MixFormerV2: Efficient Fully Transformer Tracking
Yutao Cui
Tian-Shu Song
Gangshan Wu
Limin Wang
90
59
0
25 May 2023
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
Ahmed F. AbouElhamayed
Angela Cui
Javier Fernandez-Marques
Nicholas D. Lane
Mohamed S. Abdelfattah
MQ
78
6
0
25 May 2023
PDP: Parameter-free Differentiable Pruning is All You Need
Minsik Cho
Saurabh N. Adya
Devang Naik
VLM
67
12
0
18 May 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
89
12
0
11 May 2023
LayerNAS: Neural Architecture Search in Polynomial Complexity
Yicheng Fan
Dana Alon
Jingyue Shen
Daiyi Peng
Keshav Kumar
Yun Long
Xin Wang
Fotis Iliopoulos
Da-Cheng Juan
Erik Vee
72
2
0
23 Apr 2023
QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Zhepeng Wang
Jinyang Li
Zhirui Hu
Blake Gage
Elizabeth Iwasawa
Weiwen Jiang
97
11
0
23 Apr 2023
Evil from Within: Machine Learning Backdoors through Hardware Trojans
Alexander Warnecke
Julian Speith
Janka Möller
Konrad Rieck
C. Paar
AAML
211
3
0
17 Apr 2023
Canvas: End-to-End Kernel Architecture Search in Neural Networks
Chenggang Zhao
Genghan Zhang
Mingyu Gao
56
1
0
16 Apr 2023
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos
Zhen Dong
Javier Mauricio Duarte
A. Gholami
Michael W. Mahoney
Jovan Mitrevski
Nhan Tran
MQ
57
3
0
13 Apr 2023
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
Ziwei Wang
Jiwen Lu
Han Xiao
Shengyu Liu
Jie Zhou
OffRL
61
1
0
13 Apr 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong
Ye Lu
Surong Dai
Deng Qian
Chenkun Du
Tao Li
MQ
57
0
0
07 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
211
48
0
30 Mar 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma
Huixia Li
Xiawu Zheng
Xuefeng Xiao
Rui Wang
Shilei Wen
Xin Pan
Chia-Wen Lin
Rongrong Ji
MQ
64
12
0
21 Mar 2023
Gated Compression Layers for Efficient Always-On Models
Haiguang Li
T. Thormundsson
I. Poupyrev
N. Gillian
76
2
0
15 Mar 2023
SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference
Li Zhang
Xudong Wang
Jiahang Xu
Quanlu Zhang
Yujing Wang
Yuqing Yang
Ningxin Zheng
Ting Cao
Mao Yang
MQ
55
3
0
15 Mar 2023
R2 Loss: Range Restriction Loss for Model Compression and Quantization
Arnav Kundu
Chungkuk Yoo
Srijan Mishra
Minsik Cho
Saurabh N. Adya
MQ
65
1
0
14 Mar 2023
MetaMixer: A Regularization Strategy for Online Knowledge Distillation
Maorong Wang
L. Xiao
T. Yamasaki
KELM MoE
41
1
0
14 Mar 2023
AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
Hao Wen
Yuanchun Li
Zunshuai Zhang
Shiqi Jiang
Xiaozhou Ye
Ouyang Ye
Yaqin Zhang
Yunxin Liu
140
33
0
13 Mar 2023
Bag of Tricks with Quantized Convolutional Neural Networks for image classification
Jie Hu
Mengze Zeng
Enhua Wu
MQ
57
2
0
13 Mar 2023
TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT
Yuting Sun
Tong Chen
Quoc Viet Hung Nguyen
Hongzhi Yin
92
13
0
07 Mar 2023
Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas
Yury Nahshan
Hanoch Kremer
Gil Lederman
Shira Zaloshinski
Zheng Li
E. Haleva
MQ
119
1
0
03 Mar 2023
Structured Pruning for Deep Convolutional Neural Networks: A survey
Yang He
Lingao Xiao
3DPC
116
143
0
01 Mar 2023
DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
Jiajun Zhou
Jiajun Wu
Yizhao Gao
Yuhao Ding
Chaofan Tao
Yue Liu
Fengbin Tu
Kwang-Ting Cheng
Hayden Kwok-Hay So
Ngai Wong
MQ
71
7
0
24 Feb 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
81
3
0
15 Feb 2023
SEAM: Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization
Chen Tang
Kai Ouyang
Zenghao Chai
Yunpeng Bai
Yuan Meng
Zhi Wang
Wenwu Zhu
MQ
79
9
0
14 Feb 2023
A Practical Mixed Precision Algorithm for Post-Training Quantization
N. Pandey
Markus Nagel
M. V. Baalen
Yin-Ruey Huang
Chirag I. Patel
Tijmen Blankevoort
MQ
64
22
0
10 Feb 2023
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
Yingchun Wang
Jingcai Guo
Song Guo
Weizhan Zhang
MQ
75
21
0
09 Feb 2023
DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System
Minkyoung Cho
Kang G. Shin
38
2
0
03 Feb 2023
Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer
Elfie Guo
Caitlin Stanton
Xiaofan Zhang
T. Jablin
Navid Lambert-Shirzad
Jian Li
Chia-Wei Chou
Siddharth Joshi
Yu Wang
MQ
92
3
0
02 Feb 2023