Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2009.14794
Cited By
Rethinking Attention with Performers
30 September 2020
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
Tamás Sarlós
Peter Hawkins
Jared Davis
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Rethinking Attention with Performers"
50 / 1,019 papers shown
Title
Linear Video Transformer with Feature Fixation
Kaiyue Lu
Zexia Liu
Jianyuan Wang
Weixuan Sun
Zhen Qin
...
Xuyang Shen
Huizhong Deng
Xiaodong Han
Yuchao Dai
Yiran Zhong
35
4
0
15 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
3DV
46
9
0
14 Oct 2022
LSG Attention: Extrapolation of pretrained Transformers to long sequences
Charles Condevaux
S. Harispe
40
24
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
36
47
0
13 Oct 2022
Designing Robust Transformers using Robust Kernel Density Estimation
Xing Han
Tongzheng Ren
T. Nguyen
Khai Nguyen
Joydeep Ghosh
Nhat Ho
32
6
0
11 Oct 2022
An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification
Ilias Chalkidis
Xiang Dai
Manos Fergadiotis
Prodromos Malakasiotis
Desmond Elliott
44
34
0
11 Oct 2022
LARF: Two-level Attention-based Random Forests with a Mixture of Contamination Models
A. Konstantinov
Lev V. Utkin
49
0
0
11 Oct 2022
Turbo Training with Token Dropout
Tengda Han
Weidi Xie
Andrew Zisserman
ViT
39
10
0
10 Oct 2022
Self-explaining Hierarchical Model for Intraoperative Time Series
Dingwen Li
Bing Xue
C. King
Bradley A. Fritz
M. Avidan
Joanna Abraham
Chenyang Lu
AI4CE
21
3
0
10 Oct 2022
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
H. H. Mao
71
21
0
09 Oct 2022
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
Kazuki Irie
Jürgen Schmidhuber
39
5
0
07 Oct 2022
Temporally Consistent Transformers for Video Generation
Wilson Yan
Danijar Hafner
Stephen James
Pieter Abbeel
DiffM
27
28
0
05 Oct 2022
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability
Yufan Zhuang
Zihan Wang
Fangbo Tao
Jingbo Shang
ViT
AI4TS
37
3
0
05 Oct 2022
Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective
Peter Baumgartner
Daniel V. Smith
Mashud Rana
Reena Kapoor
Elena Tartaglia
A. Schutt
Ashfaqur Rahman
John Taylor
S. Dunstall
32
4
0
04 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
45
26
0
03 Oct 2022
DARTFormer: Finding The Best Type Of Attention
Jason Brown
Yiren Zhao
Ilia Shumailov
Robert D. Mullins
32
6
0
02 Oct 2022
Wide Attention Is The Way Forward For Transformers?
Jason Brown
Yiren Zhao
Ilia Shumailov
Robert D. Mullins
21
7
0
02 Oct 2022
Grouped self-attention mechanism for a memory-efficient Transformer
Bumjun Jung
Yusuke Mukuta
Tatsuya Harada
AI4TS
14
3
0
02 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
Transformer Meets Boundary Value Inverse Problems
Ruchi Guo
Shuhao Cao
Long Chen
MedIm
38
21
0
29 Sep 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
Rupak Vignesh Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
39
13
0
29 Sep 2022
Spikformer: When Spiking Neural Network Meets Transformer
Zhaokun Zhou
Yuesheng Zhu
Chao He
Yaowei Wang
Shuicheng Yan
Yonghong Tian
Liuliang Yuan
147
249
0
29 Sep 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
30
23
0
28 Sep 2022
Searching a High-Performance Feature Extractor for Text Recognition Network
Hui Zhang
Quanming Yao
James T. Kwok
X. Bai
30
7
0
27 Sep 2022
Liquid Structural State-Space Models
Ramin Hasani
Mathias Lechner
Tsun-Hsuan Wang
Makram Chahine
Alexander Amini
Daniela Rus
AI4TS
107
98
0
26 Sep 2022
From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion
Rui Wan
Shuangjie Xu
Wei Wu
Xiaoyi Zou
Tongyi Cao
3DPC
20
4
0
25 Sep 2022
Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation
Xuesu Xiao
Tingnan Zhang
K. Choromanski
Edward J. Lee
Anthony G. Francis
...
Leila Takayama
Roy Frostig
Jie Tan
Carolina Parada
Vikas Sindhwani
77
55
0
22 Sep 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
38
183
0
21 Sep 2022
Adapting Pretrained Text-to-Text Models for Long Text Sequences
Wenhan Xiong
Anchit Gupta
Shubham Toshniwal
Yashar Mehdad
Wen-tau Yih
RALM
VLM
62
30
0
21 Sep 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
40
58
0
20 Sep 2022
Graph Reasoning Transformer for Image Parsing
Dong Zhang
Jinhui Tang
Kwang-Ting Cheng
ViT
26
16
0
20 Sep 2022
Real-time Online Video Detection with Temporal Smoothing Transformers
Yue Zhao
Philipp Krahenbuhl
ViT
69
57
0
19 Sep 2022
Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach
Shih-Lun Wu
Yi-Hsuan Yang
56
14
0
17 Sep 2022
Hydra Attention: Efficient Attention with Many Heads
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Judy Hoffman
99
78
0
15 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
82
31
0
14 Sep 2022
Multiple View Performers for Shape Completion
David Watkins-Valls
Peter K. Allen
K. Choromanski
Jacob Varley
Nicholas R. Waytowich
22
1
0
13 Sep 2022
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Kartik Audhkhasi
Yinghui Huang
Bhuvana Ramabhadran
Pedro J. Moreno
27
3
0
13 Sep 2022
SkIn: Skimming-Intensive Long-Text Classification Using BERT for Medical Corpus
Yufeng Zhao
Haiying Che
VLM
29
0
0
13 Sep 2022
Graph Neural Networks for Molecules
Yuyang Wang
Zijie Li
A. Farimani
GNN
AI4CE
52
21
0
12 Sep 2022
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
Chinmay Hegde
73
111
0
11 Sep 2022
Pre-Training a Graph Recurrent Network for Language Representation
Yile Wang
Linyi Yang
Zhiyang Teng
M. Zhou
Yue Zhang
GNN
38
1
0
08 Sep 2022
Morphology-preserving Autoregressive 3D Generative Modelling of the Brain
Petru-Daniel Tudosiu
W. H. Pinaya
M. Graham
Pedro Borges
Virginia Fernandez
...
Disha Mehra
M. Vella
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
3DH
DiffM
MedIm
27
20
0
07 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
73
575
0
07 Sep 2022
Extend and Explain: Interpreting Very Long Language Models
Joel Stremmel
B. Hill
Jeffrey S. Hertzberg
Jaime Murillo
Llewelyn Allotey
Eran Halperin
17
4
0
02 Sep 2022
Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation
Amir Yazdanbakhsh
Ashkan Moradifirouzabadi
Zheng Li
Mingu Kang
31
32
0
01 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
40
109
0
31 Aug 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
54
6
0
30 Aug 2022
Transfer Ranking in Finance: Applications to Cross-Sectional Momentum with Data Scarcity
Daniel Poh
Stephen J. Roberts
S. Zohren
31
7
0
21 Aug 2022
An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics
Aly M. Kassem
Omar Mohamed
Ali Ashraf
Ahmed Elbehery
Salma Jamal
Anas Salah
A. Ghoneim
25
3
0
20 Aug 2022
Treeformer: Dense Gradient Trees for Efficient Attention Computation
Lovish Madaan
Srinadh Bhojanapalli
Himanshu Jain
Prateek Jain
35
6
0
18 Aug 2022
Previous
1
2
3
...
12
13
14
...
19
20
21
Next