2109.15082
Towards Efficient Post-training Quantization of Pre-trained Language Models
30 September 2021
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
Papers citing
"Towards Efficient Post-training Quantization of Pre-trained Language Models"
13 / 13 papers shown
Title
A Bag of Tricks for Scaling CPU-based Deep FFMs to more than 300m Predictions per Second
Blaž Škrlj
Benjamin Ben-Shalom
Grega Gaspersic
Adi Schwartz
Ramzi Hoseisi
Naama Ziporin
Davorin Kopic
Andraz Tori
37
0
0
14 Jul 2024
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
Florence Regol
Joud Chataoui
Bertrand Charpentier
Mark J. Coates
Pablo Piantanida
Stephan Günnemann
45
0
0
20 Jun 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
MQ
43
0
0
31 May 2024
Herd: Using multiple, smaller LLMs to match the performances of proprietary, large LLMs via an intelligent composer
S. N. Hari
Matt Thomson
32
0
0
30 Oct 2023
Exploring Post-Training Quantization of Protein Language Models
Shuang Peng
Fei Yang
Ning Sun
Sheng Chen
Yanfeng Jiang
Aimin Pan
MQ
21
0
0
30 Oct 2023
Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong
Zheng Liu
Xiangshan Chen
ViT
44
17
0
21 Apr 2023
Towards Accurate Post-Training Quantization for Vision Transformer
Yifu Ding
Haotong Qin
Qing-Yu Yan
Z. Chai
Junjie Liu
Xiaolin K. Wei
Xianglong Liu
MQ
54
68
0
25 Mar 2023
Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings
Ulf A. Hamster
Ji-Ung Lee
Alexander Geyken
Iryna Gurevych
21
0
0
13 Mar 2023
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
142
221
0
31 Dec 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018