ResearchTrend.AI

Rethinking Attention with Performers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
30 September 2020 · arXiv: 2009.14794
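For context, here is a minimal NumPy sketch of the FAVOR+ mechanism the paper introduces: queries and keys are mapped through positive random features so that phi(q)·phi(k) approximates the softmax kernel exp(q·k/sqrt(d)), which lets attention be computed in time linear in sequence length. The num_features value and the plain Gaussian (rather than orthogonal) projections below are illustrative simplifications, not the paper's exact configuration.

import numpy as np

def positive_random_features(x, projection):
    # phi(x) = exp(w^T x - ||x||^2 / 2) / sqrt(m): strictly positive features
    # whose inner products approximate the softmax kernel exp(q^T k) in expectation.
    m = projection.shape[0]
    sq_norm = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(x @ projection.T - sq_norm) / np.sqrt(m)

def performer_attention(Q, K, V, num_features=256, seed=0):
    # Q, K, V: (seq_len, d). Scaling both Q and K by d**-0.25 makes the
    # feature map approximate softmax(Q K^T / sqrt(d)) attention.
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, d))  # orthogonalization omitted for brevity
    q = positive_random_features(Q / d ** 0.25, W)   # (seq_len, m)
    k = positive_random_features(K / d ** 0.25, W)   # (seq_len, m)
    # Reassociate the matrix product: phi(Q) (phi(K)^T V) instead of (Q K^T) V,
    # so cost is O(seq_len * m * d) rather than O(seq_len^2 * d).
    kv = k.T @ V                      # (m, d)
    normalizer = q @ k.sum(axis=0)    # row sums of the implicit attention matrix
    return (q @ kv) / normalizer[:, None]

# Example: 128 tokens of dimension 64.
# rng = np.random.default_rng(1)
# out = performer_attention(rng.standard_normal((128, 64)),
#                           rng.standard_normal((128, 64)),
#                           rng.standard_normal((128, 64)))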

Papers citing "Rethinking Attention with Performers"

Showing 50 of 1,019 citing papers (title, publication date, topic tags, and authors):
1. Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model (16 Aug 2022) [DiffM]
   Xiulong Yang, Sheng-Min Shih, Yinlin Fu, Xiaoting Zhao, Shihao Ji
2. Deep is a Luxury We Don't Have (11 Aug 2022) [ViT, MedIm]
   Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas P. Matthews, Jason Su, Sadanand Singh
3. Investigating Efficiently Extending Transformers for Long Input Summarization (08 Aug 2022) [RALM, LLMAG]
   Jason Phang, Yao-Min Zhao, Peter J. Liu
4. FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators (08 Aug 2022) [AI4Cl]
   Thorsten Kurth, Shashank Subramanian, P. Harrington, Jaideep Pathak, Morteza Mardani, D. Hall, Andrea Miele, K. Kashinath, Anima Anandkumar
5. Global Hierarchical Attention for 3D Point Cloud Analysis (07 Aug 2022) [3DPC]
   Dan Jia, Alexander Hermans, Bastian Leibe
6. SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences (03 Aug 2022)
   Peng Qi, Guangtao Wang, Jing Huang
7. Implicit Two-Tower Policies (02 Aug 2022) [OffRL]
   Yunfan Zhao, Qingkai Pan, K. Choromanski, Deepali Jain, Vikas Sindhwani
8. Efficient Long-Text Understanding with Short-Text Models (01 Aug 2022) [VLM]
   Maor Ivgi, Uri Shaham, Jonathan Berant
9. Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization (01 Aug 2022)
   T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang
10. GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation (29 Jul 2022) [AI4CE]
    Jian Yang, Yuwei Yin, Liqun Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Furu Wei, Zhoujun Li
11. Neural Architecture Search on Efficient Transformers and Beyond (28 Jul 2022)
    Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong
12. Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? (21 Jul 2022) [AI4CE]
    Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, W. Fedus, J. Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler
13. Multi Resolution Analysis (MRA) for Approximate Self-Attention (21 Jul 2022)
    Zhanpeng Zeng, Sourav Pal, Jeffery Kline, G. Fung, Vikas Singh
14. GenHPF: General Healthcare Predictive Framework with Multi-task Multi-source Learning (20 Jul 2022) [AI4TS]
    Kyunghoon Hur, Jungwoo Oh, Junu Kim, Jiyoun Kim, Min Jae Lee, Eunbyeol Choi, Seong-Eun Moon, Young-Hak Kim, Louis Atallah, Edward Choi
15. OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos (20 Jul 2022) [ViT]
    Kyung-Min Jin, Gun-Hee Lee, Seongyeong Lee
16. Conditional DETR V2: Efficient Detection Transformer with Box Queries (18 Jul 2022) [ViT]
    Xiaokang Chen, Fangyun Wei, Gang Zeng, Jingdong Wang
17. Multi-manifold Attention for Vision Transformers (18 Jul 2022) [ViT]
    D. Konstantinidis, Ilias Papastratis, K. Dimitropoulos, P. Daras
18. Recurrent Memory Transformer (14 Jul 2022) [CLL]
    Aydar Bulatov, Yuri Kuratov, Andrey Kravchenko
19. QSAN: A Near-term Achievable Quantum Self-Attention Network (14 Jul 2022)
    Jinjing Shi, Ren-Xin Zhao, Wenxuan Wang, Shenmin Zhang, Xuelong Li
20. DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation (13 Jul 2022)
    Songhua Liu, Jingwen Ye, Sucheng Ren, Xinchao Wang
21. AGBoost: Attention-based Modification of Gradient Boosting Machine (12 Jul 2022) [ODL]
    A. Konstantinov, Lev V. Utkin, Stanislav R. Kirpichenko
22. Attention and Self-Attention in Random Forests (09 Jul 2022)
    Lev V. Utkin, A. Konstantinov
23. FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis (08 Jul 2022)
    Yongqiang Wang, Zhou Zhao
24. Vision Transformers: State of the Art and Research Challenges (07 Jul 2022) [ViT]
    Bo-Kai Ruan, Hong-Han Shuai, Wen-Huang Cheng
25. Pure Transformers are Powerful Graph Learners (06 Jul 2022)
    Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong
26. Softmax-free Linear Transformers (05 Jul 2022) [ViT]
    Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang
27. Compute Cost Amortized Transformer for Streaming ASR (05 Jul 2022)
    Yifan Xie, J. Macoskey, Martin H. Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel
28. Rethinking Query-Key Pairwise Interactions in Vision Transformers (01 Jul 2022)
    Cheng-rong Li, Yangxin Liu
29. Learning Functions on Multiple Sets using Multi-Set Transformers (30 Jun 2022) [ViT]
    Kira A. Selby, Ahmad Rashid, I. Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart
30. Deformable Graph Transformer (29 Jun 2022)
    Jinyoung Park, Seongjun Yun, Hyeon-ju Park, Jaewoo Kang, Jisu Jeong, KyungHyun Kim, Jung-Woo Ha, Hyunwoo J. Kim
31. Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation (28 Jun 2022) [ViT]
    Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao
32. Long Range Language Modeling via Gated State Spaces (27 Jun 2022) [Mamba]
    Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur
33. Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning (24 Jun 2022)
    Cheng Tan, Zhangyang Gao, Lirong Wu, Yongjie Xu, Jun Xia, Siyuan Li, Stan Z. Li
34. Vicinity Vision Transformer (21 Jun 2022) [ViT]
    Weixuan Sun, Zhen Qin, Huiyuan Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong
35. All you need is feedback: Communication with block attention feedback codes (19 Jun 2022)
    Emre Ozfatura, Yulin Shao, A. Perotti, B. Popović, Deniz Gunduz
36. EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm (19 Jun 2022) [ViT]
    Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao
37. SimA: Simple Softmax-free Attention for Vision Transformers (17 Jun 2022)
    Soroush Abbasi Koohpayegani, Hamed Pirsiavash
38. Born for Auto-Tagging: Faster and better with new objective functions (15 Jun 2022)
    Chiung-ju Liu, Huang-Ting Shieh
39. Peripheral Vision Transformer (14 Jun 2022) [ViT, MDE]
    Juhong Min, Yucheng Zhao, Chong Luo, Minsu Cho
40. Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis (14 Jun 2022)
    Rania Briq, Chuhang Zou, L. Pishchulin, Christopher Broaddus, Juergen Gall
41. ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths (12 Jun 2022)
    Ruslan Khalitov, Tong Yu, Lei Cheng, Zhirong Yang
42. Bootstrapping Multi-view Representations for Fake News Detection (12 Jun 2022)
    Qichao Ying, Xiaoxiao Hu, Yangming Zhou, Zhenxing Qian, Dan Zeng, Shiming Ge
43. SparseFormer: Attention-based Depth Completion Network (09 Jun 2022) [MoE, MDE]
    Frederik Warburg, Michael Ramamonjisoa, Manuel López-Antequera
44. Scaleformer: Iterative Multi-scale Refining Transformers for Time Series Forecasting (08 Jun 2022) [AI4TS]
    Amin Shabani, A. Abdi, Li Meng, Tristan Sylvain
45. Separable Self-attention for Mobile Vision Transformers (06 Jun 2022) [ViT, MQ]
    Sachin Mehta, Mohammad Rastegari
46. DeeprETA: An ETA Post-processing System at Scale (05 Jun 2022)
    Xinyu Hu, Tanmay Binaykiya, Eric C. Frank, Olcay Cirit
47. EAANet: Efficient Attention Augmented Convolutional Networks (03 Jun 2022)
    Runqing Zhang, Tianshu Zhu
48. Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives (02 Jun 2022) [ViT, OOD, MedIm]
    Jun Li, Junyu Chen, Yucheng Tang, Ce Wang, Bennett A. Landman, S. K. Zhou
49. BayesFormer: Transformer with Uncertainty Estimation (02 Jun 2022) [UQCV, BDL]
    Karthik Abinav Sankararaman, Sinong Wang, Han Fang
50. Fair Comparison between Efficient Attentions (01 Jun 2022)
    Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung