HAQ: Hardware-Aware Automated Quantization with Mixed Precision

21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
    MQ

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 436 papers shown
Title
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
83
146
0
27 Sep 2021
Distribution-sensitive Information Retention for Accurate Binary Neural Network
Haotong Qin
Xiangguo Zhang
Ruihao Gong
Yifu Ding
Yi Xu
Xianglong Liu
MQ
71
95
0
25 Sep 2021
Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning
Hanwei Fan
Jiandong Mu
W. Zhang
73
5
0
22 Sep 2021
OMPQ: Orthogonal Mixed Precision Quantization
Yuexiao Ma
Taisong Jin
Xiawu Zheng
Yan Wang
Huixia Li
Yongjian Wu
Guannan Jiang
Wei Zhang
Rongrong Ji
MQ
116
37
0
16 Sep 2021
Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks
Cheng Gong
Ye Lu
Kunpeng Xie
Zongming Jin
Tao Li
Yanzhi Wang
MQ
62
7
0
08 Sep 2021
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables
B. Prabakaran
Asima Akhtar
Semeen Rehman
Osman Hasan
Muhammad Shafique
23
10
0
07 Sep 2021
Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss
J. H. Lee
Jihun Yun
Sung Ju Hwang
Eunho Yang
MQ
92
15
0
05 Sep 2021
Architecture Aware Latency Constrained Sparse Neural Networks
Tianli Zhao
Qinghao Hu
Xiangyu He
Weixiang Xu
Jiaxing Wang
Cong Leng
Jian Cheng
67
0
0
01 Sep 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu
Dingheng Wang
Xiaotong Lu
Fan Yang
Guoqi Li
W. Dong
Jianbo Shi
104
18
0
30 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
79
92
0
30 Aug 2021
DKM: Differentiable K-Means Clustering Layer for Neural Network Compression
Minsik Cho
Keivan Alizadeh Vahid
Saurabh N. Adya
Mohammad Rastegari
95
34
0
28 Aug 2021
Dynamic Network Quantization for Efficient Video Inference
Ximeng Sun
Yikang Shen
Chun-Fu Chen
A. Oliva
Rogerio Feris
Kate Saenko
89
46
0
23 Aug 2021
On the Acceleration of Deep Neural Network Inference using Quantized Compressed Sensing
Meshia Cédric Oveneke
MQ
49
0
0
23 Aug 2021
Online Multi-Granularity Distillation for GAN Compression
Yuxi Ren
Jie Wu
Xuefeng Xiao
Jianchao Yang
104
40
0
16 Aug 2021
Generalizable Mixed-Precision Quantization via Attribution Rank Preservation
Ziwei Wang
Han Xiao
Jiwen Lu
Jie Zhou
MQ
74
32
0
05 Aug 2021
MOHAQ: Multi-Objective Hardware-Aware Quantization of Recurrent Neural Networks
Nesma M. Rezk
Tomas Nordstrom
D. Stathis
Z. Ul-Abdin
E. Aksoy
A. Hemani
MQ
46
1
0
02 Aug 2021
Pruning Ternary Quantization
Danyang Liu
Xiangshan Chen
Jie Fu
Chen Ma
Xue Liu
MQ
62
0
0
23 Jul 2021
LANA: Latency Aware Network Acceleration
Pavlo Molchanov
Jimmy Hall
Hongxu Yin
Jan Kautz
Nicolò Fusi
Arash Vahdat
139
10
0
12 Jul 2021
HEMP: High-order Entropy Minimization for neural network comPression
Enzo Tartaglione
Stéphane Lathuilière
Attilio Fiandrotti
Marco Cagnazzo
Marco Grangetto
MQ
46
7
0
12 Jul 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
112
346
0
27 Jun 2021
APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores
Boyuan Feng
Yuke Wang
Tong Geng
Ang Li
Yufei Ding
MQ
74
37
0
23 Jun 2021
Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
Santiago Miret
Vui Seng Chua
Mattias Marder
Mariano Phielipp
Nilesh Jain
Somdeb Majumdar
38
8
0
14 Jun 2021
Sparse PointPillars: Maintaining and Exploiting Input Sparsity to Improve Runtime on Embedded Systems
Kyle Vedder
Eric Eaton
3DPC
50
13
0
12 Jun 2021
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu
Yongan Zhang
Yang Zhang
David D. Cox
Yingyan Lin
MQ
125
18
0
11 Jun 2021
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
ViT
143
711
0
03 Jun 2021
RED : Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
CVBM
73
24
0
31 May 2021
NAAS: Neural Accelerator Architecture Search
Yujun Lin
Mengtian Yang
Song Han
84
60
0
27 May 2021
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
Zhaoxia Deng
Jongsoo Park
P. T. P. Tang
Haixin Liu
...
S. Nadathur
Changkyu Kim
Maxim Naumov
S. Naghshineh
M. Smelyanskiy
59
11
0
26 May 2021
DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications
Yaoyu Zhang
Wai Teng Tang
Matthew Kay Fei Lee
Chuping Qu
Weng-Fai Wong
Rick Siow Mong Goh
61
0
0
25 May 2021
BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
Haoping Bai
Mengsi Cao
Ping Huang
Jiulong Shan
MQ
83
34
0
19 May 2021
Pareto-Optimal Quantized ResNet Is Mostly 4-bit
AmirAli Abdolrashidi
Lisa Wang
Shivani Agrawal
J. Malmaud
Oleg Rybakov
Chas Leichner
Lukasz Lew
MQ
71
36
0
07 May 2021
On the Adversarial Robustness of Quantized Neural Networks
Micah Gorsline
James T. Smith
Cory E. Merkel
AAML
85
19
0
01 May 2021
HAO: Hardware-aware neural Architecture Optimization for Efficient Inference
Zhen Dong
Yizhao Gao
Qijing Huang
J. Wawrzynek
Hayden Kwok-Hay So
Kurt Keutzer
79
37
0
26 Apr 2021
Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks
Sayeed Shafayet Chowdhury
Isha Garg
Kaushik Roy
67
40
0
26 Apr 2021
InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks
Yonggan Fu
Zhongzhi Yu
Yongan Zhang
Yi Ding
Chaojian Li
Yongyuan Liang
Mingchao Jiang
Zhangyang Wang
Yingyan Lin
92
3
0
22 Apr 2021
Differentiable Model Compression via Pseudo Quantization Noise
Alexandre Défossez
Yossi Adi
Gabriel Synnaeve
DiffM
MQ
92
50
0
20 Apr 2021
Coarse-to-Fine Searching for Efficient Generative Adversarial Networks
Jiahao Wang
Han Shu
Weihao Xia
Yujiu Yang
Yunhe Wang
GAN
75
5
0
19 Apr 2021
TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT
H. F. Langroudi
Vedant Karia
Tej Pandit
Dhireesha Kudithipudi
MQ
49
10
0
06 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Hervé Jégou
Matthijs Douze
ViT
103
796
0
02 Apr 2021
Network Quantization with Element-wise Gradient Scaling
Junghyup Lee
Dohyung Kim
Bumsub Ham
MQ
91
120
0
02 Apr 2021
Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric Quantizer
Phuoc Pham
J. Abraham
Jaeyong Chung
MQ
81
13
0
01 Apr 2021
Bit-Mixer: Mixed-precision networks with runtime bit-width selection
Adrian Bulat
Georgios Tzimiropoulos
MQ
77
27
0
31 Mar 2021
RCT: Resource Constrained Training for Edge AI
Tian Huang
Yaoyu Zhang
Ming Yan
Qiufeng Wang
Rick Siow Mong Goh
82
8
0
26 Mar 2021
n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization
Yuiko Sakuma
Hiroshi Sumihiro
Jun Nishikawa
Toshiki Nakamura
Ryoji Ikegaya
MQ
81
1
0
22 Mar 2021
Data-free mixed-precision quantization using novel sensitivity metric
Donghyun Lee
M. Cho
Seungwon Lee
Joonho Song
Changkyu Choi
MQ
68
2
0
18 Mar 2021
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices
Md Mohaimenuzzaman
Christoph Bergmeir
I. West
B. Meyer
150
43
0
05 Mar 2021
Anycost GANs for Interactive Image Synthesis and Editing
Ji Lin
Richard Y. Zhang
F. Ganz
Song Han
Jun-Yan Zhu
140
86
0
04 Mar 2021
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde
Po-An Tsai
Sitao Huang
Vikas Chandra
A. Parashar
Christopher W. Fletcher
72
97
0
02 Mar 2021
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
Ximeng Sun
Yikang Shen
Chun-Fu Chen
Naigang Wang
Bowen Pan
Kailash Gopalakrishnan
A. Oliva
Rogerio Feris
Kate Saenko
MQ
76
4
0
02 Mar 2021
FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout
Samuel Horváth
Stefanos Laskaridis
Mario Almeida
Ilias Leontiadis
Stylianos I. Venieris
Nicholas D. Lane
304
275
0
26 Feb 2021