Towards Fully 8-bit Integer Inference for the Transformer Model

Ye Lin, Yanyang Li, Tengbo Liu, Tong Xiao, Tongran Liu, Jingbo Zhu
17 September 2020 · MQ
arXiv: 2009.08034 · PDF · HTML
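
As context for the list below: the cited paper studies running Transformer inference entirely in 8-bit integer arithmetic. The sketch below is a minimal, generic illustration of that idea (symmetric per-tensor int8 quantization with int32 accumulation), not the paper's actual scheme; the function names and the max-abs calibration rule are assumptions for illustration.

    import numpy as np

    def quantize(x, scale):
        # Map float values to int8 with a symmetric per-tensor scale.
        # (Illustrative only; the paper's actual scheme may differ.)
        q = np.round(x / scale)
        return np.clip(q, -127, 127).astype(np.int8)

    def int8_matmul(x_q, w_q, x_scale, w_scale):
        # Integer matmul with int32 accumulation, then one float rescale.
        acc = x_q.astype(np.int32) @ w_q.astype(np.int32)
        return acc.astype(np.float32) * (x_scale * w_scale)

    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 16)).astype(np.float32)  # activations
    w = rng.standard_normal((16, 8)).astype(np.float32)  # weights

    x_scale = np.abs(x).max() / 127.0  # simple max-abs calibration (assumption)
    w_scale = np.abs(w).max() / 127.0

    y_int8 = int8_matmul(quantize(x, x_scale), quantize(w, w_scale), x_scale, w_scale)
    y_fp32 = x @ w
    print("max abs error:", np.abs(y_int8 - y_fp32).max())

In a fully integer-only pipeline, the final rescale would itself be carried out with integer multiplies and shifts; the float rescale above is a simplification.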

Papers citing "Towards Fully 8-bit Integer Inference for the Transformer Model" (14 of 14 papers shown)

I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu, Yuan Cheng, Dawei Yang, Zhihang Yuan, Jiangyong Yu, Chen Xu, Sifan Zhou
28 May 2024 · MQ

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Abhinav Agarwalla, Abhay Gupta, Alexandre Marques, Shubhra Pandit, Michael Goin, ..., Tuan Nguyen, Mahmoud Salem, Dan Alistarh, Sean Lie, Mark Kurtz
06 May 2024 · MoE, SyDa

BitCoin: Bidirectional Tagging and Supervised Contrastive Learning based Joint Relational Triple Extraction Framework
Luyao He, Zhongbao Zhang, Sen Su, Yuxin Chen
21 Sep 2023

Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong, Zheng Liu, Xiangshan Chen
21 Apr 2023 · ViT

SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers
Alberto Marchisio, David Durà, Maurizio Capra, Maurizio Martina, Guido Masera, Mohamed Bennai
08 Apr 2023

AMD-HookNet for Glacier Front Segmentation
Fei Wu, Nora Gourmelon, T. Seehaus, Jianlin Zhang, M. Braun, Andreas Maier, Vincent Christlein
06 Feb 2023

EIT: Enhanced Interactive Transformer
Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu
20 Dec 2022

The RoyalFlush System for the WMT 2022 Efficiency Task
Bo Qin, Aixin Jia, Qiang Wang, Jian Lu, Shuqin Pan, Haibo Wang, Ming-Tso Chen
03 Dec 2022

I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li, Qingyi Gu
04 Jul 2022 · MQ

SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani, Hamed Pirsiavash
17 Jun 2022

The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu, Bei Li, Ye Lin, Yinqiao Li, Yanyang Li, Chenglong Wang, Tong Xiao, Jingbo Zhu
16 Sep 2021

The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, ..., Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu
16 Sep 2021

VOGUE: Answer Verbalization through Multi-Task Learning
Endri Kacupaj, Shyamnath Premnadh, Kuldeep Singh, Jens Lehmann, M. Maleshkova
24 Jun 2021

An Efficient Transformer Decoder with Compressed Sub-layers
Yanyang Li, Ye Lin, Tong Xiao, Jingbo Zhu
03 Jan 2021