arXiv: 2009.14794
Rethinking Attention with Performers
30 September 2020
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
Papers citing
"Rethinking Attention with Performers"
50 / 1,019 papers shown
Transformer Acceleration with Dynamic Sparse Attention. Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie. 21 Oct 2021.

Interpreting Deep Learning Models in Natural Language Processing: A Review. Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard H. Hovy, Jiwei Li. 20 Oct 2021.

Inductive Biases and Variable Creation in Self-Attention Mechanisms. Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang. 19 Oct 2021.

Compositional Attention: Disentangling Search and Retrieval. Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie. 18 Oct 2021.

Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research. Ross Gruetzemacher, D. Paradice. 18 Oct 2021.

3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers. Z. Shi, Zhao Meng, Yiran Xing, Yunpu Ma, Roger Wattenhofer. 17 Oct 2021. [ViT]

Improving Transformers with Probabilistic Attention Keys. Tam Nguyen, T. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher. 16 Oct 2021.

On Learning the Transformer Kernel. Sankalan Pal Chowdhury, Adamos Solomou, Kumar Avinava Dubey, Mrinmaya Sachan. 15 Oct 2021. [ViT]

StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data. Victor Pellegrain, Myriam Tami, M. Batteux, Céline Hudelot. 15 Oct 2021. [AI4TS]

How Does Momentum Benefit Deep Neural Networks Architecture Design? A Few Case Studies. Bao Wang, Hedi Xia, T. Nguyen, Stanley Osher. 13 Oct 2021. [AI4CE]

Leveraging redundancy in attention with Reuse Transformers. Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar. 13 Oct 2021.

StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning. Jinghuan Shang, Kumara Kahatapitiya, Xiang Li, Michael S. Ryoo. 12 Oct 2021. [OffRL]
LightSeq2: Accelerated Training for Transformer-based Models on GPUs. Xiaohui Wang, Yang Wei, Ying Xiong, Guyue Huang, Xian Qian, Yufei Ding, Mingxuan Wang, Lei Li. 12 Oct 2021. [VLM]

DCT: Dynamic Compressive Transformer for Modeling Unbounded Sequence. Kai-Po Chang, Wei-Yun Ma. 10 Oct 2021.

Paperswithtopic: Topic Identification from Paper Title Only. Daehyun Cho, C. Wallraven. 09 Oct 2021.

Hybrid Random Features. K. Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, ..., Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller. 08 Oct 2021.

Token Pooling in Vision Transformers. D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel. 08 Oct 2021. [ViT]

Efficient and Private Federated Learning with Partially Trainable Networks. Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen. 06 Oct 2021. [FedML]

ABC: Attention with Bounded-memory Control. Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith. 06 Oct 2021.

Ripple Attention for Visual Perception with Sub-quadratic Complexity. Lin Zheng, Huijie Pan, Lingpeng Kong. 06 Oct 2021.

PoNet: Pooling Network for Efficient Token Mixing in Long Sequences. Chao-Hong Tan, Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Zhenhua Ling. 06 Oct 2021. [ViT]

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems. Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty. 30 Sep 2021.

SCIMAT: Science and Mathematics Dataset. Neeraj Kollepara, Snehith Kumar Chatakonda, Kiran Ravish. 30 Sep 2021.

UFO-ViT: High Performance Linear Vision Transformer without Softmax. Jeonggeun Song. 29 Sep 2021. [ViT]

Understanding and Overcoming the Challenges of Efficient Transformer Quantization. Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort. 27 Sep 2021. [MQ]

Long-Range Transformers for Dynamic Spatiotemporal Forecasting. J. E. Grigsby, Zhe Wang, Nam Nguyen, Yanjun Qi. 24 Sep 2021. [AI4TS]

Predicting Attention Sparsity in Transformers. Marcos Vinícius Treviso, António Góis, Patrick Fernandes, E. Fonseca, André F. T. Martins. 24 Sep 2021.
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers. Yi Tay, Mostafa Dehghani, J. Rao, W. Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler. 22 Sep 2021.

Audiomer: A Convolutional Transformer For Keyword Spotting. Surya Kant Sahu, Sai Mitheran, Juhi Kamdar, Meet Gandhi. 21 Sep 2021.

Do Long-Range Language Models Actually Use Long-Range Context? Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer. 19 Sep 2021. [RALM]

Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy. Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy. 17 Sep 2021.

Sparse Factorization of Large Square Matrices. Ruslan Khalitov, Tong Yu, Lei Cheng, Zhirong Yang. 16 Sep 2021.

Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental NLU. Patrick Kahardipraja, Brielen Madureira, David Schlangen. 15 Sep 2021. [CLL]

PnP-DETR: Towards Efficient Visual Analysis with Transformers. Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan. 15 Sep 2021. [ViT]

SHAPE: Shifted Absolute Position Embedding for Transformers. Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui. 13 Sep 2021.

Deciphering Environmental Air Pollution with Large Scale City Data. Mayukh Bhattacharyya, Sayan Nag, Udita Ghosh. 09 Sep 2021. [AI4CE]

The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning. Yujin Tang, David R Ha. 07 Sep 2021.

Skim-Attention: Learning to Focus via Document Layout. Laura Nguyen, Thomas Scialom, Jacopo Staiano, Benjamin Piwowarski. 02 Sep 2021.

∞-former: Infinite Memory Transformer. Pedro Henrique Martins, Zita Marinho, André F. T. Martins. 01 Sep 2021.

Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning. Ran Tian, Joshua Maynez, Ankur P. Parikh. 30 Aug 2021. [ViT]
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation. Samuel Cahyawijaya. 24 Aug 2021.

Automated Identification of Cell Populations in Flow Cytometry Data with Transformers. Matthias Wödlinger, Michael Reiter, Lisa Weijler, Margarita Maurer-Granofszky, A. Schumich, ..., Stefanie Groeneveld-Krentz, Jorge G. Rossi, Leonid Karawajew, Richard Ratei, Michael N. Dworzak. 23 Aug 2021. [MedIm]

Neural Operator: Learning Maps Between Function Spaces. Nikola B. Kovachki, Zong-Yi Li, Burigede Liu, Kamyar Azizzadenesheli, K. Bhattacharya, Andrew M. Stuart, Anima Anandkumar. 19 Aug 2021. [AI4CE]

Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification. Shyam A. Tailor, R. D. Jong, Tiago Azevedo, Matthew Mattina, Partha P. Maji. 13 Aug 2021. [3DPC, GNN]

AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing. Katikapalli Subramanyam Kalyan, A. Rajasekharan, S. Sangeetha. 12 Aug 2021. [VLM, LM&MA]

Adaptive Multi-Resolution Attention with Linear Complexity. Yao Zhang, Yunpu Ma, T. Seidl, Volker Tresp. 10 Aug 2021.

Expressive Power and Loss Surfaces of Deep Learning Models. S. Dube. 08 Aug 2021.

Global Self-Attention as a Replacement for Graph Convolution. Md Shamim Hussain, Mohammed J Zaki, D. Subramanian. 07 Aug 2021. [ViT]

Token Shift Transformer for Video Classification. Hao Zhang, Y. Hao, Chong-Wah Ngo. 05 Aug 2021. [ViT]

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention. T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang. 05 Aug 2021.