2403.19591
Cited By
Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers
28 March 2024
Pingcheng Dong
Yonghao Tan
Dong Zhang
Tianwei Ni
Xuejiao Liu
Yu Liu
Peng Luo
Luhong Liang
Shih-yang Liu
Xijie Huang
Huaiyu Zhu
Yun Pan
Fengwei An
Kwang-Ting Cheng
MQ
Papers citing
"Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers"
2 / 2 papers shown
Title
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Judy Hoffman
99
76
0
15 Sep 2022
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
99
341
0
05 Jan 2021