Rethinking Attention with Performers (arXiv 2009.14794)
30 September 2020
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
Tamás Sarlós
Peter Hawkins
Jared Davis
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
Papers citing "Rethinking Attention with Performers"
50 / 1,016 papers shown
Improving Social Media Popularity Prediction with Multiple Post Dependencies
Zhizhen Zhang
Xiao-Zhu Xie
Meng Yang
Ye Tian
Yong-jia Jiang
Yong Cui
29
5
0
28 Jul 2023
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Zhen Qin
Dong Li
Weigao Sun
Weixuan Sun
Xuyang Shen
...
Yunshen Wei
Baohong Lv
Xiao Luo
Yu Qiao
Yiran Zhong
43
15
0
27 Jul 2023
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Tri Dao
LRM
32
1,140
0
17 Jul 2023
Fast Quantum Algorithm for Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
Ruizhe Zhang
LRM
31
20
0
16 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
45
62
0
16 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
In-context Autoencoder for Context Compression in a Large Language Model
Tao Ge
Jing Hu
Lei Wang
Xun Wang
Si-Qing Chen
Furu Wei
RALM
40
68
0
13 Jul 2023
Unsupervised 3D out-of-distribution detection with latent diffusion models
M. Graham
W. H. Pinaya
P. Wright
Petru-Daniel Tudosiu
Y. Mah
...
H. Jäger
D. Werring
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
DiffM
MedIm
25
9
0
07 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
41
151
0
05 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
30
8
0
05 Jul 2023
Sumformer: Universal Approximation for Efficient Transformers
Silas Alberti
Niclas Dern
L. Thesing
Gitta Kutyniok
27
16
0
05 Jul 2023
Compound Attention and Neighbor Matching Network for Multi-contrast MRI Super-resolution
Wenxuan Chen
Sirui Wu
Shuai Wang
Zhong Li
Jia Yang
Huifeng Yao
Xiao-quan Song
SupR
MedIm
35
1
0
05 Jul 2023
Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu
Yichen Zhu
ViT
72
17
0
05 Jul 2023
Spike-driven Transformer
Man Yao
Jiakui Hu
Zhaokun Zhou
Liuliang Yuan
Yonghong Tian
Boxing Xu
Guoqi Li
34
118
0
04 Jul 2023
Challenges in Domain-Specific Abstractive Summarization and How to Overcome them
Anum Afzal
Juraj Vladika
Daniel Braun
Florian Matthes
HILM
30
10
0
03 Jul 2023
SMILE: Evaluation and Domain Adaptation for Social Media Language Understanding
Vasilisa Bashlovkina
Riley Matthews
Zhaobin Kuang
Simon Baumgartner
Michael Bendersky
40
4
0
30 Jun 2023
Transformers in Healthcare: A Survey
Subhash Nerella
S. Bandyopadhyay
Jiaqing Zhang
Miguel Contreras
Scott Siegel
...
Jessica Sena
B. Shickel
A. Bihorac
Kia Khezeli
Parisa Rashidi
MedIm
AI4CE
21
25
0
30 Jun 2023
FLuRKA: Fast and accurate unified Low-Rank & Kernel Attention
Ahan Gupta
Hao Guo
Yueming Yuan
Yan-Quan Zhou
Charith Mendis
21
2
0
27 Jun 2023
Extending Context Window of Large Language Models via Positional Interpolation
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
17
494
0
27 Jun 2023
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Daya Guo
Canwen Xu
Nan Duan
Jian Yin
Julian McAuley
20
78
0
26 Jun 2023
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
61
255
0
24 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
26
0
0
23 Jun 2023
Iterated Piecewise Affine (IPA) Approximation for Language Modeling
Davood Shamsi
Wenhui Hua
Brian Williams
13
0
0
21 Jun 2023
Training Transformers with 4-bit Integers
Haocheng Xi
Changhao Li
Jianfei Chen
Jun Zhu
MQ
25
47
0
21 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
53
13
0
19 Jun 2023
Block-State Transformers
Mahan Fathi
Jonathan Pilault
Orhan Firat
C. Pal
Pierre-Luc Bacon
Ross Goroshin
42
17
0
15 Jun 2023
Recurrent Action Transformer with Memory
A. Staroverov
A. Bessonov
Dmitry A. Yudin
A. Kovalev
Aleksandr I. Panov
OffRL
41
4
0
15 Jun 2023
Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking
Björn Bebensee
Haejun Lee
31
4
0
15 Jun 2023
NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
Qitian Wu
Wentao Zhao
Zenan Li
David Wipf
Junchi Yan
27
209
0
14 Jun 2023
Revisiting Token Pruning for Object Detection and Instance Segmentation
Yifei Liu
Mathias Gehrig
Nico Messikommer
Marco Cannici
Davide Scaramuzza
ViT
VLM
42
25
0
12 Jun 2023
Recurrent Attention Networks for Long-text Modeling
Xianming Li
Zongxi Li
Xiaotian Luo
Haoran Xie
Xing Lee
Yingbin Zhao
Fu Lee Wang
Qing Li
RALM
38
15
0
12 Jun 2023
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
Haoran You
Huihong Shi
Yipin Guo
Yingyan Lin
34
16
0
10 Jun 2023
S^3: Increasing GPU Utilization during Generative Inference for Higher Throughput
Yunho Jin
Chun-Feng Wu
David Brooks
Gu-Yeon Wei
29
62
0
09 Jun 2023
Using Sequences of Life-events to Predict Human Lives
Germans Savcisens
Tina Eliassi-Rad
L. K. Hansen
L. Mortensen
Lau Lilleholt
Anna Rogers
Ingo Zettler
Sune Lehmann
AI4TS
44
36
0
05 Jun 2023
ESTISR: Adapting Efficient Scene Text Image Super-resolution for Real-Scenes
Minghao Fu
Xin Man
Yihan Xu
Jie Shao
33
2
0
04 Jun 2023
RITA: Group Attention is All You Need for Timeseries Analytics
Jiaming Liang
Lei Cao
Samuel Madden
Z. Ives
Guoliang Li
AI4TS
18
0
0
02 Jun 2023
The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles
Md Shamim Hussain
Mohammed J Zaki
D. Subramanian
39
3
0
02 Jun 2023
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Matteo Pagliardini
Daniele Paliotta
Martin Jaggi
François Fleuret
LRM
20
22
0
01 Jun 2023
Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che
Zhaokun Zhou
Zhengyu Ma
Wei Fang
Yanqing Chen
Shuaijie Shen
Liuliang Yuan
Yonghong Tian
29
4
0
01 Jun 2023
Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation
Yingyi Chen
Qinghua Tao
F. Tonin
Johan A. K. Suykens
42
19
0
31 May 2023
Recasting Self-Attention with Holographic Reduced Representations
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
8
8
0
31 May 2023
Blockwise Parallel Transformer for Large Context Models
Hao Liu
Pieter Abbeel
49
11
0
30 May 2023
HiGen: Hierarchical Graph Generative Networks
Mahdi Karami
39
4
0
30 May 2023
Taylorformer: Probabilistic Predictions for Time Series and other Processes
Omer Nivron
R. Parthipan
Damon J. Wischik
BDL
AI4TS
21
2
0
30 May 2023
GAN-MPC: Training Model Predictive Controllers with Parameterized Cost Functions using Demonstrations from Non-identical Experts
Returaj Burnwal
Anirban Santara
Nirav P. Bhatt
Balaraman Ravindran
Gaurav Aggarwal
19
0
0
30 May 2023
Brainformers: Trading Simplicity for Efficiency
Yan-Quan Zhou
Nan Du
Yanping Huang
Daiyi Peng
Chang Lan
...
Zhifeng Chen
Quoc V. Le
Claire Cui
J. Laudon
J. Dean
MoE
18
23
0
29 May 2023
A Quantitative Review on Language Model Efficiency Research
Meng Jiang
Hy Dang
Lingbo Tong
27
0
0
28 May 2023
Scalable Transformer for PDE Surrogate Modeling
Zijie Li
Dule Shu
A. Farimani
35
67
0
27 May 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM
ViT
43
9
0
26 May 2023
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
28
202
0
26 May 2023