Rethinking Attention with Performers


30 September 2020
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
Tamás Sarlós
Peter Hawkins
Jared Davis
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
arXiv: 2009.14794

Papers citing "Rethinking Attention with Performers"

50 / 1,019 papers shown
Scalable Transformer for PDE Surrogate Modeling
Zijie Li
Dule Shu
A. Farimani
35
67
0
27 May 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM
ViT
43
9
0
26 May 2023
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
28
202
0
26 May 2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami
Martin Jaggi
LLMAG
27
149
0
25 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurelien Lucchi
Thomas Hofmann
42
53
0
25 May 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
32
9
0
24 May 2023
Predicting Token Impact Towards Efficient Vision Transformer
Hong Wang
Su Yang
Xiaoke Huang
Weishan Zhang
20
0
0
24 May 2023
Adapting Language Models to Compress Contexts
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
16
174
0
24 May 2023
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
36
2
0
24 May 2023
A Joint Time-frequency Domain Transformer for Multivariate Time Series Forecasting
Yushu Chen
Shengzhuo Liu
Jinzhe Yang
Hao Jing
Wenlai Zhao
Guang-Wu Yang
AI4TS
24
15
0
24 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Ziwei He
Meng-Da Yang
Minwei Feng
Jingcheng Yin
Xinbing Wang
Jingwen Leng
Zhouhan Lin
ViT
37
13
0
24 May 2023
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
Jinghan Yao
Nawras Alnaasan
Tianrun Chen
Hari Subramoni
Dhabaleswar K. Panda
32
2
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
43
7
0
22 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
32
12
0
22 May 2023
Tokenized Graph Transformer with Neighborhood Augmentation for Node Classification in Large Graphs
Jinsong Chen
Chang-Shu Liu
Kai-Xin Gao
Gaichao Li
Kun He
29
4
0
22 May 2023
Quasi-Monte Carlo Graph Random Features
Isaac Reid
K. Choromanski
Adrian Weller
19
8
0
21 May 2023
Prefix Propagation: Parameter-Efficient Tuning for Long Sequences
Jonathan Li
Will Aitken
R. Bhambhoria
Xiao-Dan Zhu
17
14
0
20 May 2023
Less is More! A slim architecture for optimal language translation
Luca Herranz-Celotti
E. Rrapaj
36
0
0
18 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
Hao Chen
Jingkuan Song
Feng Zheng
ViT
32
0
0
17 May 2023
SoundStorm: Efficient Parallel Audio Generation
Zalan Borsos
Matthew Sharifi
Damien Vincent
Eugene Kharitonov
Neil Zeghidour
Marco Tagliasacchi
28
98
0
16 May 2023
DLUE: Benchmarking Document Language Understanding
Ruoxi Xu
Hongyu Lin
Xinyan Guan
Xianpei Han
Yingfei Sun
Le Sun
ELM
39
0
0
16 May 2023
SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric Kernels
Alexander Moreno
Jonathan Mei
Luke Walters
23
0
0
15 May 2023
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
L. Yu
Daniel Simig
Colin Flaherty
Armen Aghajanyan
Luke Zettlemoyer
M. Lewis
32
84
0
12 May 2023
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Xinyu Liu
Houwen Peng
Ningxin Zheng
Yuqing Yang
Han Hu
Yixuan Yuan
ViT
25
277
0
11 May 2023
ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Yanfang Li
Huan Wang
Muxia Sun
LM&MA
AI4TS
AI4CE
29
46
0
10 May 2023
Toeplitz Neural Network for Sequence Modeling
Zhen Qin
Xiaodong Han
Weixuan Sun
Bowen He
Dong Li
Dongxu Li
Yuchao Dai
Lingpeng Kong
Yiran Zhong
AI4TS
ViT
38
40
0
08 May 2023
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng
Cole Hawkins
Min-Fong Hong
Aston Zhang
Nikolaos Pappas
Vikas Singh
Shuai Zheng
21
6
0
07 May 2023
Leveraging Synthetic Targets for Machine Translation
Sarthak Mittal
Oleksii Hrinchuk
Oleksii Kuchaiev
35
2
0
07 May 2023
Online Gesture Recognition using Transformer and Natural Language Processing
Guénolé Silvestre
F. Balado
O. Akinremi
Mirco Ramo
ViT
29
2
0
05 May 2023
The Role of Global and Local Context in Named Entity Recognition
Arthur Amalvy
Vincent Labatut
Richard Dufour
38
4
0
04 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers
Yanjun Liu
Xianfeng Zeng
Fandong Meng
Jie Zhou
32
3
0
04 May 2023
Sequence Modeling with Multiresolution Convolutional Memory
Jiaxin Shi
Ke Alexander Wang
E. Fox
42
13
0
02 May 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Amanda Bertsch
Uri Alon
Graham Neubig
Matthew R. Gormley
RALM
116
122
0
02 May 2023
Accelerating Neural Self-Improvement via Bootstrapping
Kazuki Irie
Jürgen Schmidhuber
29
1
0
02 May 2023
Taming graph kernels with random features
K. Choromanski
32
12
0
29 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
50
274
0
24 Apr 2023
Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
Yu-Hui Chen
Raman Sarokin
Juhyun Lee
Jiuqiang Tang
Chuo-Ling Chang
Andrei Kulik
Matthias Grundmann
VLM
45
38
0
21 Apr 2023
Improving Autoregressive NLP Tasks via Modular Linearized Attention
Victor Agostinelli
Lizhong Chen
27
1
0
17 Apr 2023
One-Class SVM on siamese neural network latent space for Unsupervised Anomaly Detection on brain MRI White Matter Hyperintensities
Nicolas Pinon
Robin Trombetta
Carole Lartizien
9
3
0
17 Apr 2023
Cross Attention Transformers for Multi-modal Unsupervised Whole-Body PET Anomaly Detection
Ashay Patel
Petru-Daniel Tudosiu
W. H. Pinaya
G. Cook
Vicky Goh
Sebastien Ourselin
M. Jorge Cardoso
OOD
ViT
MedIm
28
11
0
14 Apr 2023
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Paul Pu Liang
Faisal Mahmood
41
44
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
Devil's on the Edges: Selective Quad Attention for Scene Graph Generation
Deunsol Jung
Sanghyun Kim
Wonhui Kim
Minsu Cho
3DPC
GNN
27
32
0
07 Apr 2023
EGA-Depth: Efficient Guided Attention for Self-Supervised Multi-Camera Depth Estimation
Y. Shi
H. Cai
Amin Ansari
Fatih Porikli
MDE
88
17
0
06 Apr 2023
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation
Peiyao Wang
Haibin Ling
15
2
0
04 Apr 2023
Dialogue-Contextualized Re-ranking for Medical History-Taking
Jian Zhu
Ilya Valmianski
Anitha Kannan
19
1
0
04 Apr 2023
Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition
Saumya Yashmohini Sahai
Jing Liu
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Anastasios Alexandridis
...
Ross McGowan
Ariya Rastrow
Feng-Ju Chang
Athanasios Mouchtaris
Siegfried Kunzmann
39
5
0
03 Apr 2023
Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR
Rami Botros
Anmol Gulati
Tara N. Sainath
K. Choromanski
Ruoming Pang
Trevor Strohman
Weiran Wang
Jiahui Yu
MQ
26
3
0
31 Mar 2023
Solving Regularized Exp, Cosh and Sinh Regression Problems
Zhihang Li
Zhao Song
Dinesh Manocha
36
39
0
28 Mar 2023
Accelerating Trajectory Generation for Quadrotors Using Transformers
Srinath Tankasala
Mitch Pryor
17
1
0
27 Mar 2023