ResearchTrend.AI
Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

28 March 2024
Pingcheng Dong
Yonghao Tan
Dong Zhang
Tianwei Ni
Xuejiao Liu
Yu Liu
Peng Luo
Luhong Liang
Shih-yang Liu
Xijie Huang
Huaiyu Zhu
Yun Pan
Fengwei An
Kwang-Ting Cheng
    MQ

Papers citing "Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers"

2 / 2 papers shown
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Judy Hoffman
15 Sep 2022
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
05 Jan 2021