ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers
arXiv:2307.03493 · 7 Jul 2023 · MQ
Gamze Islamoglu, Moritz Scherer, G. Paulin, Tim Fischer, Victor J. B. Jung, Angelo Garofalo, Luca Benini

Papers citing "ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers" (8 of 8 papers shown)

VEXP: A Low-Cost RISC-V ISA Extension for Accelerated Softmax Computation in Transformers
Run Wang, Gamze Islamoglu, Andrea Belano, Viviane Potocnik, Francesco Conti, Angelo Garofalo, Luca Benini
15 Apr 2025

EXAQ: Exponent Aware Quantization For LLMs Acceleration
Moran Shkolnik, Maxim Fishman, Brian Chmiel, Hilla Ben-Yaacov, Ron Banner, Kfir Y. Levy
04 Oct 2024 · MQ

Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow
Philip Wiese, Gamze İslamoğlu, Moritz Scherer, Luka Macan, Victor J. B. Jung, Alessio Burrello, Francesco Conti, Luca Benini
05 Aug 2024

Reusing Softmax Hardware Unit for GELU Computation in Transformers
C. Peltekis, K. Alexandridis, G. Dimitrakopoulos
15 Feb 2024

BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
Yuhao Ji, Chao Fang, Zhongfeng Wang
22 Jan 2024

Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO
Julian Moosmann, Pietro Bonazzi, Yawei Li, Sizhen Bian, Philipp Mayer, Luca Benini, Michele Magno
02 Nov 2023

Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius, Heng Wang, Lorenzo Torresani
09 Feb 2021 · ViT

I-BERT: Integer-only BERT Quantization
Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer
05 Jan 2021 · MQ