ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Cited By listing for arXiv:1811.08886
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
v3 (latest) · 21 November 2018 · ArXiv: abs / PDF / HTML
Kuan-Chieh Wang, Zhijian Liu, Chengyue Wu, Ji Lin, Song Han
Topics: MQ
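For orientation only: HAQ automates the choice of a hardware-aware bitwidth per layer. The sketch below shows the per-layer mixed-precision idea it builds on, using a plain symmetric uniform quantizer. The function `quantize_linear`, the layer names, and the hand-picked bitwidth `policy` are illustrative assumptions for this sketch, not HAQ's learned policy or implementation.

```python
import numpy as np

def quantize_linear(w, bits):
    """Symmetric uniform quantization of a weight tensor to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # integer levels per sign, e.g. 127 for 8 bits
    scale = np.abs(w).max() / levels      # map the largest weight to the top level
    return np.round(w / scale) * scale    # quantize, then dequantize back to float

# Toy "network": two weight tensors standing in for two layers.
rng = np.random.default_rng(0)
layers = {"conv1": rng.standard_normal((8, 8)), "conv2": rng.standard_normal((8, 8))}

# Illustrative mixed-precision policy: a different bitwidth per layer.
# HAQ's point is that this assignment is *searched*, not hand-picked.
policy = {"conv1": 8, "conv2": 4}

quantized = {name: quantize_linear(w, policy[name]) for name, w in layers.items()}
err = {n: float(np.abs(layers[n] - quantized[n]).max()) for n in layers}
```

HAQ's contribution is selecting the `policy` automatically, via reinforcement learning driven by latency and energy feedback from the target hardware; the uniform quantizer itself is the simple part.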

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 436 papers shown. Each entry lists title (date) [topic tags], then authors.

 1. Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference (22 Feb 2021) [MQ]
    B. Hawks, Javier Mauricio Duarte, Nicholas J. Fraser, Alessandro Pappalardo, N. Tran, Yaman Umuroglu
 2. BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (20 Feb 2021) [MQ]
    Huanrui Yang, Lin Duan, Yiran Chen, Hai Helen Li
 3. GradFreeBits: Gradient Free Bit Allocation for Dynamic Low Precision Neural Networks (18 Feb 2021) [MQ]
    Ben Bodner, G. B. Shalom, Eran Treister
 4. An Information-Theoretic Justification for Model Pruning (16 Feb 2021)
    Berivan Isik, Tsachy Weissman, Albert No
 5. FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation (15 Feb 2021) [MQ]
    Chaofan Tao, Rui Lin, Quan Chen, Zhaoyang Zhang, Ping Luo, Ngai Wong
 6. Confounding Tradeoffs for Neural Network Quantization (12 Feb 2021) [MQ]
    Sahaj Garg, Anirudh Jain, Joe Lou, Mitchell Nahmias
 7. Dynamic Precision Analog Computing for Neural Networks (12 Feb 2021)
    Sahaj Garg, Joe Lou, Anirudh Jain, Mitchell Nahmias
 8. BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction (10 Feb 2021) [MQ]
    Yuhang Li, Ruihao Gong, Xu Tan, Yang Yang, Peng Hu, Qi Zhang, F. Yu, Wei Wang, Shi Gu
 9. AHAR: Adaptive CNN for Energy-efficient Human Activity Recognition in Low-power Edge Devices (03 Feb 2021)
    Nafiul Rashid, B. U. Demirel, Mohammad Abdullah Al Faruque
10. Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators (27 Jan 2021) [MQ]
    Hamzah Abdel-Aziz, Ali Shafiee, J. Shin, A. Pedram, Joseph Hassoun
11. Pruning and Quantization for Deep Neural Network Acceleration: A Survey (24 Jan 2021) [MQ]
    Tailin Liang, C. Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang
12. Network Pruning using Adaptive Exemplar Filters (20 Jan 2021) [VLM]
    Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, QiXiang Ye
13. Multi-Task Network Pruning and Embedded Optimization for Real-time Deployment in ADAS (19 Jan 2021)
    F. Dellinger, T. Boulay, Diego Mendoza Barrenechea, Said El-Hachimi, Isabelle Leang, Fabian Burger
14. Single-path Bit Sharing for Automatic Loss-aware Model Compression (13 Jan 2021) [MQ]
    Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan
15. SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training (04 Jan 2021)
    Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Y. Fu, Yingyan Lin, Zhangyang Wang
16. BinaryBERT: Pushing the Limit of BERT Quantization (31 Dec 2020) [MQ]
    Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
17. FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training (24 Dec 2020) [MQ]
    Y. Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, K. Gopalakrishnan, Zhangyang Wang, Yingyan Lin
18. Adaptive Precision Training for Resource Constrained Devices (23 Dec 2020)
    Tian Huang, Yaoyu Zhang, Qiufeng Wang
19. Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead (21 Dec 2020) [BDL]
    Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Mohamed Bennai
20. Machine Learning Systems in the IoT: Trustworthiness Trade-offs for Edge Intelligence (01 Dec 2020)
    Wiebke Toussaint, Aaron Yi Ding
21. Ax-BxP: Approximate Blocked Computation for Precision-Reconfigurable Deep Neural Network Acceleration (25 Nov 2020)
    Reena Elangovan, Shubham Jain, A. Raghunathan
22. Bringing AI To Edge: From Deep Learning's Perspective (25 Nov 2020)
    Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam
23. Larq Compute Engine: Design, Benchmark, and Deploy State-of-the-Art Binarized Neural Networks (18 Nov 2020) [3DV, MQ]
    T. Bannink, Arash Bakhtiari, Adam Hillier, Lukas Geiger, T. D. Bruin, Leon Overweel, J. Neeven, K. Helwegen
24. Irregularly Tabulated MLP for Fast Point Feature Embedding (13 Nov 2020)
    Yusuke Sekikawa, Teppei Suzuki
25. Automated Model Compression by Jointly Applied Pruning and Quantization (12 Nov 2020) [MQ]
    Wenting Tang, Xingxing Wei, Yue Liu
26. Resource-Aware Pareto-Optimal Automated Machine Learning Platform (30 Oct 2020)
    Yao Yang, Andrew Nam, M. Nasr-Azadani, Teresa Tung
27. Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks (29 Oct 2020) [MQ]
    Julieta Martinez, Jashan Shewakramani, Ting Liu, Ioan Andrei Bârsan, Wenyuan Zeng, R. Urtasun
28. An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models (14 Oct 2020) [MQ]
    Zihan Zhao, Yuncong Liu, Lu Chen, Qi Liu, Rao Ma, Kai Yu
29. Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search (09 Oct 2020) [MQ]
    Mingzhu Shen, Feng Liang, Ruihao Gong, Yuhang Li, Chuming Li, Chen Lin, F. Yu, Junjie Yan, Wanli Ouyang
30. Online Knowledge Distillation via Multi-branch Diversity Enhancement (02 Oct 2020)
    Zheng Li, Ying Huang, Defang Chen, Tianren Luo, Ning Cai, Zhigeng Pan
31. MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework (16 Sep 2020) [MQ]
    Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Runbin Shi, Xue Lin, Yanzhi Wang
32. FleXOR: Trainable Fractional Quantization (09 Sep 2020) [MQ]
    Dongsoo Lee, S. Kwon, Byeongwook Kim, Yongkweon Jeon, Baeseong Park, Jeongin Yun
33. Layer-specific Optimization for Mixed Data Flow with Mixed Precision in FPGA Design for CNN-based Object Detectors (03 Sep 2020) [MQ]
    Duy-Thanh Nguyen, Hyun Kim, Hyuk-Jae Lee
34. Transform Quantization for CNN (Convolutional Neural Network) Compression (02 Sep 2020) [MQ]
    Sean I. Young, Wang Zhe, David S. Taubman, B. Girod
35. GAN Slimming: All-in-One GAN Compression by A Unified Optimization Framework (25 Aug 2020)
    Haotao Wang, Shupeng Gui, Haichuan Yang, Ji Liu, Zhangyang Wang
36. Matching Guided Distillation (23 Aug 2020)
    Kaiyu Yue, Jiangfan Deng, Feng Zhou
37. One Weight Bitwidth to Rule Them All (22 Aug 2020) [MQ]
    Ting-Wu Chin, P. Chuang, Vikas Chandra, Diana Marculescu
38. Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks (19 Aug 2020) [MQ]
    Xu Qian, Victor Li, Darren Crews
39. Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers (12 Aug 2020) [MQ]
    Manuele Rusci, Marco Fariselli, Alessandro Capotondi, Luca Benini
40. Degree-Quant: Quantization-Aware Training for Graph Neural Networks (11 Aug 2020) [GNN, MQ]
    Shyam A. Tailor, Javier Fernandez-Marques, Nicholas D. Lane
41. HAPI: Hardware-Aware Progressive Inference (10 Aug 2020)
    Stefanos Laskaridis, Stylianos I. Venieris, Hyeji Kim, Nicholas D. Lane
42. Modeling Data Reuse in Deep Neural Networks by Taking Data-Types into Cognizance (06 Aug 2020)
    N. Jha, Sparsh Mittal
43. Continuous-in-Depth Neural Networks (05 Aug 2020)
    A. Queiruga, N. Benjamin Erichson, D. Taylor, Michael W. Mahoney
44. Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution (31 Jul 2020) [3DPC]
    Haotian Tang, Zhijian Liu, Shengyu Zhao, Chengyue Wu, Ji Lin, Hanrui Wang, Song Han
45. WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic (26 Jul 2020) [MQ]
    Renkun Ni, Hong-Min Chu, Oscar Castañeda, Ping Yeh-Chiang, Christoph Studer, Tom Goldstein
46. Differentiable Joint Pruning and Quantization for Hardware Efficiency (20 Jul 2020) [MQ]
    Ying Wang, Yadong Lu, Tijmen Blankevoort
47. MCUNet: Tiny Deep Learning on IoT Devices (20 Jul 2020)
    Ji Lin, Wei-Ming Chen, Chengyue Wu, J. Cohn, Chuang Gan, Song Han
48. Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization (20 Jul 2020) [MQ]
    Haibao Yu, Qi Han, Jianbo Li, Jianping Shi, Guangliang Cheng, Bin Fan
49. HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs (20 Jul 2020) [MQ]
    H. Habi, Roy H. Jennings, Arnon Netzer
50. DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks (19 Jul 2020) [MQ]
    Hassan Dbouk, Hetul Sanghvi, M. Mehendale, Naresh R Shanbhag