Mixed Non-linear Quantization for Vision Transformers

26 July 2024
Gihwan Kim, Jemin Lee, Sihyeong Park, Yongin Kwon, Hyungshin Kim
Main: 13 pages, 6 figures, 6 tables; Bibliography: 3 pages
Abstract

Most quantization methods for Vision Transformers aim to reduce model size, yet they largely overlook the quantization of non-linear operations. The few works that do address non-linear operations apply a single quantization method to all of them. We believe this can be improved by employing a different quantization method for each non-linear operation. Therefore, to assign each non-linear layer the error-minimizing quantization method from a set of known methods, we propose a mixed non-linear quantization that considers layer-wise quantization sensitivity measured by an SQNR difference metric. The results show that our method outperforms I-BERT, FQ-ViT, and I-ViT in both 8-bit and 6-bit settings for ViT, DeiT, and Swin models by an average of 0.6%p and 19.6%p, respectively. Our method outperforms I-BERT and I-ViT by 0.6%p and 20.8%p, respectively, when training time is limited. We plan to release our code at https://gitlab.com/ones-ai/mixed-non-linear-quantization.
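
As described, the method measures each non-linear layer's quantization sensitivity via SQNR and assigns that layer the error-minimizing method from a set of known candidates. Below is a minimal sketch of that selection step; the GELU approximations, layer names, and calibration data are illustrative assumptions, not the authors' implementation (their released code is at the GitLab link above).

# Minimal sketch (assumed names, not the paper's code): for each non-linear
# layer, keep the candidate quantization method whose output stays closest to
# the full-precision output on calibration data, using SQNR as the metric.
import numpy as np

def gelu(x: np.ndarray) -> np.ndarray:
    # Full-precision reference non-linearity (tanh approximation of GELU).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def sqnr_db(reference: np.ndarray, approximation: np.ndarray) -> float:
    # Signal-to-quantization-noise ratio in dB; higher means less error.
    signal = np.mean(reference ** 2)
    noise = np.mean((reference - approximation) ** 2) + 1e-12
    return float(10.0 * np.log10(signal / noise + 1e-12))

# Two placeholder "quantized GELU" kernels standing in for the known methods
# (e.g. I-BERT-, FQ-ViT-, I-ViT-style approximations) the paper selects among.
def gelu_q_coarse(x):
    return np.round(gelu(x) * 16.0) / 16.0   # 4 fractional bits

def gelu_q_fine(x):
    return np.round(gelu(x) * 64.0) / 64.0   # 6 fractional bits

CANDIDATES = {"coarse": gelu_q_coarse, "fine": gelu_q_fine}

def assign_per_layer(calib_inputs: dict) -> dict:
    # calib_inputs: layer name -> calibration activations entering that op.
    assignment = {}
    for layer, x in calib_inputs.items():
        ref = gelu(x)
        scores = {name: sqnr_db(ref, fn(x)) for name, fn in CANDIDATES.items()}
        assignment[layer] = max(scores, key=scores.get)  # highest SQNR wins
    return assignment

# Toy usage with random calibration activations for two hypothetical layers.
rng = np.random.default_rng(0)
calib = {"block0.mlp.gelu": rng.normal(0, 1, 4096),
         "block1.mlp.gelu": rng.normal(0, 3, 4096)}
print(assign_per_layer(calib))
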

@article{kim2025_2407.18437,
  title={Mixed Non-linear Quantization for Vision Transformers},
  author={Gihwan Kim and Jemin Lee and Sihyeong Park and Yongin Kwon and Hyungshin Kim},
  journal={arXiv preprint arXiv:2407.18437},
  year={2025}
}