Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.15508
Cited By
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
22 July 2024
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners"
25 / 25 papers shown
Title
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
312
4,253
0
09 Jun 2023
Quadapter: Adapter for GPT-2 Quantization
Minseop Park
J. You
Markus Nagel
Simyung Chang
MQ
51
9
0
30 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
347
2,377
0
09 Nov 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
68
151
0
27 Sep 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
76
233
0
24 Aug 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
63
174
0
11 Mar 2022
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Yi-Lin Sung
Jaemin Cho
Joey Tianyi Zhou
VLM
VPVLM
72
351
0
13 Dec 2021
A White Paper on Neural Network Quantization
Markus Nagel
Marios Fournarakis
Rana Ali Amjad
Yelysei Bondarenko
M. V. Baalen
Tijmen Blankevoort
MQ
59
532
0
15 Jun 2021
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
Yuhang Li
Ruihao Gong
Xu Tan
Yang Yang
Peng Hu
Qi Zhang
F. Yu
Wei Wang
Shi Gu
MQ
102
433
0
10 Feb 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
137
351
0
05 Jan 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
432
2,081
0
31 Dec 2020
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara
Yury Nahshan
Y. Hanani
Ron Banner
Daniel Soudry
MQ
86
125
0
14 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
185
1,694
0
08 Jun 2020
GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
Ali Hadi Zadeh
Isak Edo
Omar Mohamed Awad
Andreas Moshovos
MQ
53
188
0
08 May 2020
Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel
Rana Ali Amjad
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
MQ
61
575
0
22 Apr 2020
DynaBERT: Dynamic BERT with Adaptive Width and Depth
Lu Hou
Zhiqi Huang
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
MQ
73
322
0
08 Apr 2020
Quantization Networks
Jiwei Yang
Xu Shen
Jun Xing
Xinmei Tian
Houqiang Li
Bing Deng
Jianqiang Huang
Xiansheng Hua
MQ
68
342
0
21 Nov 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
205
1,511
0
24 May 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
226
2,305
0
02 May 2019
Learned Step Size Quantization
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
MQ
69
798
0
21 Feb 2019
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM
HAI
98
1,199
0
21 Aug 2018
Scalable Methods for 8-bit Training of Neural Networks
Ron Banner
Itay Hubara
Elad Hoffer
Daniel Soudry
MQ
84
337
0
25 May 2018
PACT: Parameterized Clipping Activation for Quantized Neural Networks
Jungwook Choi
Zhuo Wang
Swagath Venkataramani
P. Chuang
Vijayalakshmi Srinivasan
K. Gopalakrishnan
MQ
58
947
0
16 May 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
136
3,111
0
15 Dec 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
258
2,842
0
26 Sep 2016
1