1812.00090
Cited By
Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
30 November 2018
Bichen Wu
Yanghan Wang
Peizhao Zhang
Yuandong Tian
Peter Vajda
Kurt Keutzer
MQ
Papers citing
"Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search"
50 / 69 papers shown
Title
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma
Jianlun Ma
Yuee Zhou
Guoyang Xie
Qiang He
Zhichao Lu
MQ
50
0
0
08 May 2025
Nearly Lossless Adaptive Bit Switching
Haiduo Huang
Zhenhua Liu
Tian Xia
Wenzhe Zhao
Pengju Ren
MQ
68
0
0
03 Feb 2025
ARQ: A Mixed-Precision Quantization Framework for Accurate and Certifiably Robust DNNs
Yuchen Yang
Shubham Ugare
Yifan Zhao
Gagandeep Singh
Sasa Misailovic
MQ
38
0
0
31 Oct 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
44
2
0
16 Oct 2024
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
28
2
0
22 Apr 2024
Adaptive quantization with mixed-precision based on low-cost proxy
Jing Chen
Qiao Yang
Senmao Tian
Shunli Zhang
MQ
32
1
0
27 Feb 2024
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng Li
MQ
43
3
0
07 Aug 2023
Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao
Zhikai Li
Lianwei Yang
Qingyi Gu
MQ
37
12
0
11 May 2023
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos
Zhen Dong
Javier Mauricio Duarte
A. Gholami
Michael W. Mahoney
Jovan Mitrevski
Nhan Tran
MQ
37
3
0
13 Apr 2023
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong
Ye Lu
Surong Dai
Deng Qian
Chenkun Du
Tao Li
MQ
32
0
0
07 Apr 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
41
102
0
27 Feb 2023
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani
J. McKinstry
S. K. Esser
R. Appuswamy
D. Modha
MQ
25
4
0
30 Jan 2023
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Dan Liu
X. Chen
Chen Ma
Xue Liu
MQ
35
3
0
24 Dec 2022
CSMPQ: Class Separability Based Mixed-Precision Quantization
Ming-Yu Wang
Taisong Jin
Miaohui Zhang
Zhengtao Yu
MQ
33
0
0
20 Dec 2022
Make RepVGG Greater Again: A Quantization-aware Approach
Xiangxiang Chu
Liang Li
Bo Zhang
MQ
53
46
0
03 Dec 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
53
3
0
22 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
30
11
0
11 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
Cecilia Latotzke
Tim Ciesielski
T. Gemmeke
MQ
13
8
0
09 Aug 2022
SDQ: Stochastic Differentiable Quantization with Mixed Precision
Xijie Huang
Zhiqiang Shen
Shichao Li
Zechun Liu
Xianghong Hu
Jeffry Wicaksana
Eric P. Xing
Kwang-Ting Cheng
MQ
34
33
0
09 Jun 2022
A Silicon Photonic Accelerator for Convolutional Neural Networks with Heterogeneous Quantization
Febin P. Sunny
Mahdi Nikdast
S. Pasricha
MQ
38
16
0
17 May 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
22
7
0
22 Mar 2022
Quantization in Layer's Input is Matter
Daning Cheng
Wenguang Chen
MQ
11
0
0
10 Feb 2022
Automatic Mixed-Precision Quantization Search of BERT
Changsheng Zhao
Ting Hua
Yilin Shen
Qian Lou
Hongxia Jin
MQ
25
19
0
30 Dec 2021
BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
Souvik Kundu
Shikai Wang
Qirui Sun
Peter A. Beerel
Massoud Pedram
MQ
29
18
0
24 Dec 2021
Automated Deep Learning: Neural Architecture Search Is Not the End
Xuanyi Dong
D. Kedziora
Katarzyna Musial
Bogdan Gabrys
34
26
0
16 Dec 2021
N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores
Yu Gong
Zhihang Xu
Zhezhi He
Weifeng Zhang
Xiaobing Tu
Xiaoyao Liang
Li Jiang
33
13
0
15 Dec 2021
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation
Zechun Liu
Kwang-Ting Cheng
Dong Huang
Eric P. Xing
Zhiqiang Shen
MQ
25
103
0
29 Nov 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
40
24
0
24 Nov 2021
Differentiable NAS Framework and Application to Ads CTR Prediction
Ravi Krishna
Aravind Kalaiah
Bichen Wu
Maxim Naumov
Dheevatsa Mudigere
M. Smelyanskiy
Kurt Keutzer
28
8
0
25 Oct 2021
BNAS v2: Learning Architectures for Binary Networks with Empirical Improvements
Dahyun Kim
Kunal Pratap Singh
Jonghyun Choi
MQ
49
7
0
16 Oct 2021
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization
Weihan Chen
Peisong Wang
Jian Cheng
MQ
49
62
0
13 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
25
133
0
27 Sep 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
How Low Can We Go: Trading Memory for Error in Low-Precision Training
Chengrun Yang
Ziyang Wu
Jerry Chee
Christopher De Sa
Madeleine Udell
18
2
0
17 Jun 2021
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu
Yongan Zhang
Yang Zhang
David D. Cox
Yingyan Lin
MQ
58
18
0
11 Jun 2021
BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
Haoping Bai
Mengsi Cao
Ping Huang
Jiulong Shan
MQ
25
34
0
19 May 2021
HAO: Hardware-aware neural Architecture Optimization for Efficient Inference
Zhen Dong
Yizhao Gao
Qijing Huang
J. Wawrzynek
Hayden Kwok-Hay So
Kurt Keutzer
22
35
0
26 Apr 2021
InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks
Yonggan Fu
Zhongzhi Yu
Yongan Zhang
Yi Ding
Chaojian Li
Yongyuan Liang
Mingchao Jiang
Zhangyang Wang
Yingyan Lin
28
3
0
22 Apr 2021
Dynamic Precision Analog Computing for Neural Networks
Sahaj Garg
Joe Lou
Anirudh Jain
Mitchell Nahmias
45
33
0
12 Feb 2021
VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai
Rangharajan Venkatesan
Haoxing Ren
B. Zimmer
W. Dally
Brucek Khailany
MQ
33
68
0
08 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
Kaoutar El Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
36
98
0
22 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
107
345
0
05 Jan 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
145
221
0
31 Dec 2020
Adaptive Precision Training for Resource Constrained Devices
Tian Huang
Yaoyu Zhang
Qiufeng Wang
38
5
0
23 Dec 2020
Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
Mingzhu Shen
Feng Liang
Ruihao Gong
Yuhang Li
Chuming Li
Chen Lin
F. Yu
Junjie Yan
Wanli Ouyang
MQ
33
36
0
09 Oct 2020
Learned Low Precision Graph Neural Networks
Yiren Zhao
Duo Wang
Daniel Bates
Robert D. Mullins
M. Jamnik
Pietro Lio
GNN
39
34
0
19 Sep 2020
Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization
Haibao Yu
Qi Han
Jianbo Li
Jianping Shi
Guangliang Cheng
Bin Fan
MQ
21
61
0
20 Jul 2020
HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs
H. Habi
Roy H. Jennings
Arnon Netzer
MQ
29
65
0
20 Jul 2020
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
Tianzhe Wang
Kuan-Chieh Wang
Han Cai
Ji Lin
Zhijian Liu
Song Han
MQ
39
174
0
15 Jun 2020
Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors
C. Coelho
Aki Kuusela
Shane Li
Zhuang Hao
T. Aarrestad
Vladimir Loncar
J. Ngadiuba
M. Pierini
Adrian Alan Pol
S. Summers
MQ
42
178
0
15 Jun 2020