2307.12659
A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization
24 July 2023
Edward Fish
Umberto Michieli
Mete Ozay
MQ
Papers citing
"A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization"
19 / 19 papers shown
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
103
106
0
04 Jul 2022
Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition
Kai Zhen
Hieu Duy Nguyen
Ravi Chinta
Nathan Susanj
Athanasios Mouchtaris
Tariq Afzal
Ariya Rastrow
MQ
57
12
0
30 Jun 2022
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Swagath Venkataramani
G. Saon
Xiaodong Cui
Brian Kingsbury
K. Gopalakrishnan
MQ
48
6
0
16 Jun 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM
MQ
122
479
0
04 Jun 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
138
322
0
25 May 2022
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Shaojin Ding
Phoenix Meadowlark
Yanzhang He
Lukasz Lew
Shivani Agrawal
Oleg Rybakov
MQ
50
35
0
29 Mar 2022
Automatic Mixed-Precision Quantization Search of BERT
Changsheng Zhao
Ting Hua
Yilin Shen
Qian Lou
Hongxia Jin
MQ
43
21
0
30 Dec 2021
FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
Yang Lin
Tianyu Zhang
Peiqin Sun
Zheng Li
Shuchang Zhou
ViT
MQ
72
156
0
27 Nov 2021
Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
Kanghyun Choi
Deokki Hong
Noseong Park
Youngsok Kim
Jinho Lee
MQ
59
65
0
04 Nov 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
101
339
0
27 Jun 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
208
697
0
24 Jan 2021
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin
Cai Liang
Tiancheng Wu
Li Zou
Zhiliang Gan
MQ
51
27
0
15 Jan 2021
EasyQuant: Post-training Quantization via Scale Optimization
Di Wu
Qingming Tang
Yongle Zhao
Ming Zhang
Ying Fu
Debing Zhang
MQ
71
78
0
30 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
285
5,801
0
20 Jun 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
88
585
0
22 Apr 2020
ZeroQ: A Novel Zero Shot Quantization Framework
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
88
397
0
01 Jan 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
91
280
0
10 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition
Alex Bie
Bharat Venkitesh
João Monteiro
Md. Akmal Haidar
Mehdi Rezagholizadeh
MQ
65
27
0
09 Nov 2019
Data-Free Quantization Through Weight Equalization and Bias Correction
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
Max Welling
MQ
75
513
0
11 Jun 2019