Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou, Junlong Liu, Zhenyu Gu, Guangyu Sun
arXiv:2110.09310 · 18 October 2021
Papers citing "Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention" (8 of 8 papers shown):
| Title | Authors | Tags | Metrics | Date |
|---|---|---|---|---|
| A Survey on Hardware Accelerators for Large Language Models | C. Kachris | — | 31 · 14 · 0 | 18 Jan 2024 |
| EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms | Shikhar Tuli, N. Jha | — | 34 · 3 · 0 | 24 Mar 2023 |
| Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design | Hongxiang Fan, Thomas C. P. Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah | — | 34 · 56 · 0 | 20 Sep 2022 |
| Bottleneck Transformers for Visual Recognition | A. Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani | SLR | 290 · 979 · 0 | 27 Jan 2021 |
| Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting | Haoyi Zhou, Shanghang Zhang, J. Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang | AI4TS | 169 · 3,876 · 0 | 14 Dec 2020 |
| Efficient Content-Based Sparse Attention with Routing Transformers | Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier | MoE | 243 · 579 · 0 | 12 Mar 2020 |
| Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer | MQ | 227 · 575 · 0 | 12 Sep 2019 |
| GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 297 · 6,956 · 0 | 20 Apr 2018 |