arXiv:2009.08034
Towards Fully 8-bit Integer Inference for the Transformer Model
17 September 2020
Ye Lin
Yanyang Li
Tengbo Liu
Tong Xiao
Tongran Liu
Jingbo Zhu
MQ
Papers citing "Towards Fully 8-bit Integer Inference for the Transformer Model"
14 / 14 papers shown
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
40
8
0
28 May 2024
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Abhinav Agarwalla
Abhay Gupta
Alexandre Marques
Shubhra Pandit
Michael Goin
...
Tuan Nguyen
Mahmoud Salem
Dan Alistarh
Sean Lie
Mark Kurtz
MoE
SyDa
45
11
0
06 May 2024
BitCoin: Bidirectional Tagging and Supervised Contrastive Learning based Joint Relational Triple Extraction Framework
Luyao He
Zhongbao Zhang
Sen Su
Yuxin Chen
24
0
0
21 Sep 2023
Transformer-based models and hardware acceleration analysis in autonomous driving: A survey
J. Zhong
Zheng Liu
Xiangshan Chen
ViT
48
17
0
21 Apr 2023
SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers
Alberto Marchisio
David Durà
Maurizio Capra
Maurizio Martina
Guido Masera
Mohamed Bennai
36
20
0
08 Apr 2023
AMD-HookNet for Glacier Front Segmentation
Fei Wu
Nora Gourmelon
T. Seehaus
Jianlin Zhang
M. Braun
Andreas Maier
Vincent Christlein
24
9
0
06 Feb 2023
EIT: Enhanced Interactive Transformer
Tong Zheng
Bei Li
Huiwen Bao
Tong Xiao
Jingbo Zhu
32
2
0
20 Dec 2022
The RoyalFlush System for the WMT 2022 Efficiency Task
Bo Qin
Aixin Jia
Qiang Wang
Jian Lu
Shuqin Pan
Haibo Wang
Ming-Tso Chen
49
1
0
03 Dec 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
57
96
0
04 Jul 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
26
25
0
17 Jun 2022
The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu
Bei Li
Ye Lin
Yinqiao Li
Yanyang Li
Chenglong Wang
Tong Xiao
Jingbo Zhu
25
7
0
16 Sep 2021
The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang
Chi Hu
Yongyu Mu
Zhongxiang Yan
Siming Wu
...
Hang Cao
Bei Li
Ye Lin
Tong Xiao
Jingbo Zhu
29
2
0
16 Sep 2021
VOGUE: Answer Verbalization through Multi-Task Learning
Endri Kacupaj
Shyamnath Premnadh
Kuldeep Singh
Jens Lehmann
M. Maleshkova
18
7
0
24 Jun 2021
An Efficient Transformer Decoder with Compressed Sub-layers
Yanyang Li
Ye Lin
Tong Xiao
Jingbo Zhu
33
29
0
03 Jan 2021