Rethinking Attention with Performers (arXiv:2009.14794)

30 September 2020
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
Tamás Sarlós
Peter Hawkins
Jared Davis
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
ArXiv · PDF · HTML
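
For readers skimming this citation list, the sketch below is a minimal NumPy illustration of the idea the cited paper proposes: FAVOR+ positive random features that approximate softmax attention in time linear in sequence length. It is a simplified sketch, not the authors' reference implementation: it uses plain i.i.d. Gaussian projections rather than the orthogonal blocks the paper recommends, ignores causal masking and numerical-stabilization details, and the function names favor_plus_features and performer_attention are illustrative assumptions.

import numpy as np

def favor_plus_features(x, w):
    # Positive random features: phi(x) = exp(w @ x - ||x||^2 / 2) / sqrt(m),
    # so that E[phi(q) . phi(k)] = exp(q . k), the (unnormalized) softmax kernel.
    m = w.shape[0]
    xw = x @ w.T                                       # (L, m) random projections
    sq = 0.5 * np.sum(x * x, axis=-1, keepdims=True)   # (L, 1) squared norms
    return np.exp(xw - sq) / np.sqrt(m)

def performer_attention(q, k, v, num_features=256, seed=0):
    # Non-causal linear-attention approximation of softmax attention.
    # Cost is O(L * m * d) instead of the O(L^2 * d) of exact attention.
    L, d = q.shape
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((num_features, d))         # i.i.d. Gaussian projections; the paper
                                                       # favors orthogonal blocks (simplified here)
    scale = d ** -0.25                                 # split the usual 1/sqrt(d) between q and k
    q_prime = favor_plus_features(q * scale, w)        # (L, m)
    k_prime = favor_plus_features(k * scale, w)        # (L, m)
    kv = k_prime.T @ v                                 # (m, d_v), computed once for all queries
    normalizer = q_prime @ k_prime.sum(axis=0)         # (L,) row-wise softmax denominators
    return (q_prime @ kv) / normalizer[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    q, k, v = (rng.standard_normal((128, 64)) for _ in range(3))
    approx = performer_attention(q, k, v, num_features=1024)
    # Exact softmax attention for comparison; agreement improves with more random features.
    logits = (q @ k.T) / np.sqrt(64)
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    exact = (weights / weights.sum(axis=-1, keepdims=True)) @ v
    print(np.mean(np.abs(approx - exact)))

The linear cost comes from reordering the matrix product: phi(K)^T V is computed once, so sequence length never enters quadratically.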

Papers citing "Rethinking Attention with Performers"

50 / 1,014 papers shown
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,224
0
22 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
38
2,176
0
20 Apr 2021
Text Guide: Improving the quality of long text classification by a text selection method based on feature importance
K. Fiok
W. Karwowski
Edgar Gutierrez-Franco
Mohammad Reza Davahli
Maciej Wilamowski
T. Ahram
Awad M. Aljuaid
Jozef Zurada
VLM
28
33
0
15 Apr 2021
Sparse Attention with Linear Units
Biao Zhang
Ivan Titov
Rico Sennrich
11
38
0
14 Apr 2021
Efficient conformer-based speech recognition with linear attention
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
24
20
0
14 Apr 2021
Co-Scale Conv-Attentional Image Transformers
Weijian Xu
Yifan Xu
Tyler A. Chang
Z. Tu
ViT
27
374
0
13 Apr 2021
Updater-Extractor Architecture for Inductive World State Representations
A. Moskvichev
James Liu
9
4
0
12 Apr 2021
Transformers: "The End of History" for NLP?
Anton Chernyavskiy
Dmitry Ilvovsky
Preslav Nakov
47
30
0
09 Apr 2021
LoFTR: Detector-Free Local Feature Matching with Transformers
Jiaming Sun
Zehong Shen
Yuang Wang
Hujun Bao
Xiaowei Zhou
ViT
31
1,141
0
01 Apr 2021
Charged particle tracking via edge-classifying interaction networks
G. Dezoort
S. Thais
Javier Mauricio Duarte
Vesal Razavimaleki
M. Atkinson
I. Ojalvo
Mark S. Neubauer
P. Elmer
25
46
0
30 Mar 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
30
120
0
30 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,088
0
29 Mar 2021
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang
Xiyang Dai
Jianwei Yang
Bin Xiao
Lu Yuan
Lei Zhang
Jianfeng Gao
ViT
29
329
0
29 Mar 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
14
93
0
26 Mar 2021
High-Fidelity Pluralistic Image Completion with Transformers
Bo Liu
Jingbo Zhang
Dongdong Chen
Jing Liao
ViT
28
231
0
25 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
ViT
145
20,710
0
25 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
60
92
0
23 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
27
126
0
19 Mar 2021
Value-aware Approximate Attention
Ankit Gupta
Jonathan Berant
18
5
0
17 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
36
210
0
11 Mar 2021
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models
Sam Bond-Taylor
Adam Leach
Yang Long
Chris G. Willcocks
VLM
TPM
41
481
0
08 Mar 2021
Generating Images with Sparse Representations
C. Nash
Jacob Menick
Sander Dieleman
Peter W. Battaglia
33
199
0
05 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
91
976
0
04 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
36
348
0
03 Mar 2021
OmniNet: Omnidirectional Representations from Transformers
Yi Tay
Mostafa Dehghani
V. Aribandi
Jai Gupta
Philip Pham
Zhen Qin
Dara Bahri
Da-Cheng Juan
Donald Metzler
47
26
0
01 Mar 2021
Chess as a Testbed for Language Model State Tracking
Shubham Toshniwal
Sam Wiseman
Karen Livescu
Kevin Gimpel
35
48
0
26 Feb 2021
Automated essay scoring using efficient transformer-based language models
C. Ormerod
Akanksha Malhotra
Amir Jafari
21
30
0
25 Feb 2021
LazyFormer: Self Attention with Lazy Update
Chengxuan Ying
Guolin Ke
Di He
Tie-Yan Liu
25
15
0
25 Feb 2021
Unsupervised Brain Anomaly Detection and Segmentation with Transformers
W. H. Pinaya
Petru-Daniel Tudosiu
Robert J. Gray
G. Rees
P. Nachev
Sebastien Ourselin
M. Jorge Cardoso
ViT
MedIm
14
59
0
23 Feb 2021
Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag
Kazuki Irie
Jürgen Schmidhuber
43
225
0
22 Feb 2021
Position Information in Transformers: An Overview
Philipp Dufter
Martin Schmitt
Hinrich Schütze
13
139
0
22 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
281
179
0
17 Feb 2021
Translational Equivariance in Kernelizable Attention
Max Horn
Kumar Shridhar
Elrich Groenewald
Philipp F. M. Baumann
16
7
0
15 Feb 2021
SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks
Bahare Fatemi
Layla El Asri
Seyed Mehran Kazemi
GNN
SSL
22
159
0
09 Feb 2021
Unlocking Pixels for Reinforcement Learning via Implicit Attention
K. Choromanski
Deepali Jain
Wenhao Yu
Xingyou Song
Jack Parker-Holder
...
Aldo Pacchiano
Anirban Santara
Yunhao Tang
Jie Tan
Adrian Weller
OffRL
33
3
0
08 Feb 2021
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yunyang Xiong
Zhanpeng Zeng
Rudrasis Chakraborty
Mingxing Tan
G. Fung
Yin Li
Vikas Singh
35
506
0
07 Feb 2021
Adaptive Semiparametric Language Models
Dani Yogatama
Cyprien de Masson d'Autume
Lingpeng Kong
KELM
RALM
43
97
0
04 Feb 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
55
1,906
0
28 Jan 2021
RomeBERT: Robust Training of Multi-Exit BERT
Shijie Geng
Peng Gao
Zuohui Fu
Yongfeng Zhang
25
26
0
24 Jan 2021
Does Dialog Length matter for Next Response Selection task? An Empirical Study
Jatin Ganhotra
Sachindra Joshi
VLM
25
5
0
24 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
On-the-Fly Attention Modulation for Neural Generation
Yue Dong
Chandra Bhagavatula
Ximing Lu
Jena D. Hwang
Antoine Bosselut
Jackie C.K. Cheung
Yejin Choi
43
12
0
02 Jan 2021
Sub-Linear Memory: How to Make Performers SLiM
Valerii Likhosherstov
K. Choromanski
Jared Davis
Xingyou Song
Adrian Weller
23
19
0
21 Dec 2020
Noise-Robust End-to-End Quantum Control using Deep Autoregressive Policy Networks
Jiahao Yao
Paul Köttering
Hans Gundlach
Lin Lin
Marin Bukov
26
14
0
12 Dec 2020
A Singular Value Perspective on Model Robustness
Malhar Jere
Maghav Kumar
F. Koushanfar
AAML
23
6
0
07 Dec 2020
Generative Capacity of Probabilistic Protein Sequence Models
Francisco McGee
Quentin Novinger
R. Levy
Vincenzo Carnevale
A. Haldane
35
34
0
03 Dec 2020
PlueckerNet: Learn to Register 3D Line Reconstructions
Liu Liu
Hongdong Li
Haodong Yao
Ruyi Zha
3DPC
3DV
25
6
0
02 Dec 2020
A Survey of Deep Learning Approaches for OCR and Document Understanding
Nishant Subramani
Alexandre Matton
Malcolm Greaves
Adrian Lam
19
48
0
27 Nov 2020
Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle
Josh Alman
T. Chu
Gary Miller
Shyam Narayanan
Mark Sellke
Zhao Song
6
1
0
23 Nov 2020
Classification by Attention: Scene Graph Classification with Prior Knowledge
Sahand Sharifzadeh
Sina Moayed Baharlou
Volker Tresp
OCL
22
50
0
19 Nov 2020