arXiv:2009.14794 · Cited By

Rethinking Attention with Performers
30 September 2020
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller

Papers citing "Rethinking Attention with Performers" (showing 50 of 1,019 papers)

TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
ViT
30 Mar 2022

Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation
Chao Chen, Haoyu Geng, Nianzu Yang, Junchi Yan, Daiyue Xue, Jianping Yu, Xiaokang Yang
HAI · AI4TS
30 Mar 2022

Locality Matters: A Locality-Biased Linear Attention for Automatic Speech Recognition
J. Sun, Guiping Zhong, Dinghao Zhou, Baoxiang Li, Yiran Zhong
29 Mar 2022

Protein language models trained on multiple sequence alignments learn phylogenetic relationships
Umberto Lupo, Damiano Sgarbossa, Anne-Florence Bitbol
29 Mar 2022

REGTR: End-to-end Point Cloud Correspondences with Transformers
Zi Jian Yew, Gim Hee Lee
3DPC · ViT
28 Mar 2022

Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Xin Huang, A. Khetan, Rene Bidart, Zohar Karnin
27 Mar 2022

Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant
27 Mar 2022

SMARAGD: Learning SMatch for Accurate and Rapid Approximate Graph Distance
Juri Opitz, Philipp Meier, Anette Frank
24 Mar 2022

On the link between conscious function and general intelligence in humans and machines
Arthur Juliani, Kai Arulkumaran, Shuntaro Sasai, Ryota Kanai
24 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai
23 Mar 2022

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu, Jiaxiang Liu, L. Chen, Yuxiang Lu, Shi Feng, Zhida Feng, Yu Sun, Hao Tian, Huancheng Wu, Hai-feng Wang
23 Mar 2022

DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers
Yan Yang, Zakir Hossain, Khan Asif, Liyuan Pan, Shafin Rahman, Eric A. Stone
23 Mar 2022

PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger, Thomas Paniagua, Xi Song, Naresh P. Cuntoor, Mun Wai Lee, Tianfu Wu
ViT
22 Mar 2022

MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer
Kuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu
ViT · MDE
21 Mar 2022

Memorizing Transformers
Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy
RALM
16 Mar 2022

Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
RALM · BDL
15 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
11 Mar 2022

WaveMix: Resource-efficient Token Mixing for Images
Pranav Jeevan, A. Sethi
07 Mar 2022

Uniform Approximations for Randomized Hadamard Transforms with Applications
Yeshwanth Cherapanamjeri, Jelani Nelson
03 Mar 2022

DCT-Former: Efficient Self-Attention with Discrete Cosine Transform
Carmelo Scribano, Giorgia Franchini, M. Prato, Marko Bertogna
02 Mar 2022

FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Bin-Rui Li, Zhongming Yu, Tian Zheng, R. Wu, Xiwen Zhang, Jian Peng, Yang You
AI4CE
02 Mar 2022

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie
28 Feb 2022

CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising
Dayang Wang, Fenglei Fan, Zhan Wu, R. Liu, Fei Wang, Hengyong Yu
ViT · MedIm
28 Feb 2022

Factorizer: A Scalable Interpretable Approach to Context Modeling for Medical Image Segmentation
Pooya Ashtari, Diana Sima, L. De Lathauwer, D. Sappey-Marinier, F. Maes, Sabine Van Huffel
ViT · MedIm
24 Feb 2022

FastRPB: A Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov, Daniil Gavrilov
23 Feb 2022

Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
21 Feb 2022

ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang, Yufei Xu, Jing Zhang, Dacheng Tao
ViT
21 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong
17 Feb 2022

ActionFormer: Localizing Moments of Actions with Transformers
Chen-Da Liu-Zhang, Jianxin Wu, Yin Li
ViT
16 Feb 2022

The NLP Task Effectiveness of Long-Range Transformers
Guanghui Qin, Yukun Feng, Benjamin Van Durme
16 Feb 2022

Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations
Youwei Liang, Chongjian Ge, Zhan Tong, Yibing Song, Jue Wang, P. Xie
ViT
16 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel
15 Feb 2022

Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens
Felix Ott, David Rügamer, Lucas Heublein, Tim Hamann, Jens Barth, Bernd Bischl, Christopher Mutschler
14 Feb 2022

Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long
13 Feb 2022

The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
11 Feb 2022

A Modern Self-Referential Weight Matrix That Learns to Modify Itself
Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
11 Feb 2022

How to Understand Masked Autoencoders
Shuhao Cao, Peng Xu, David Clifton
08 Feb 2022

TACTiS: Transformer-Attentional Copulas for Time Series
Alexandre Drouin, Étienne Marcotte, Nicolas Chapados
AI4TS
07 Feb 2022

Patch-Based Stochastic Attention for Image Editing
Nicolas Cherel, Andrés Almansa, Y. Gousseau, A. Newson
07 Feb 2022

TorchMD-NET: Equivariant Transformers for Neural Network based Molecular Potentials
Philipp Thölke, Gianni De Fabritiis
AI4CE
05 Feb 2022

Nonlinear Initialization Methods for Low-Rank Neural Networks
Kiran Vodrahalli, Rakesh Shivanna, M. Sathiamoorthy, Sagar Jain, Ed H. Chi
02 Feb 2022

Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim, Jeonggil Ko
30 Jan 2022

FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin
AI4TS
30 Jan 2022

ShapeFormer: Transformer-based Shape Completion via Sparse Representation
Xingguang Yan, Liqiang Lin, Niloy J. Mitra, Dani Lischinski, Daniel Cohen-Or, Hui Huang
ViT
25 Jan 2022

Convolutional Xformers for Vision
Pranav Jeevan, Amit Sethi
ViT
25 Jan 2022

Transformers in Medical Imaging: A Survey
Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu
ViT · LM&MA · MedIm
24 Jan 2022

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin
22 Jan 2022

Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica
GNN
21 Jan 2022

Improved Random Features for Dot Product Kernels
Jonas Wacker, Motonobu Kanagawa, Maurizio Filippone
21 Jan 2022

Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis
CLL
17 Jan 2022