Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

29 June 2020
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret
ArXiv · PDF · HTML
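
The core idea the cited papers below build on or compare against: replace the softmax similarity exp(q·k/√d) with a decomposable kernel φ(q)ᵀφ(k), where φ(x) = elu(x) + 1, so that causal attention reduces to a recurrence with a fixed-size state. That recurrence is what makes the transformer equivalent to an RNN. Below is a minimal NumPy sketch of that recurrence; the function and variable names are our own illustration, not the authors' released implementation:

    import numpy as np

    def elu_feature_map(x):
        # phi(x) = elu(x) + 1: the positive feature map proposed in the paper.
        # Clamping the exponent avoids overflow in the unused branch of np.where.
        return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

    def causal_linear_attention(Q, K, V, eps=1e-6):
        # Q, K: (n, d_k); V: (n, d_v). Runs in O(n) time with O(d_k * d_v) state.
        Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
        S = np.zeros((Qf.shape[1], V.shape[1]))  # running sum of phi(k_t) v_t^T
        z = np.zeros(Qf.shape[1])                # running sum of phi(k_t)
        out = np.zeros_like(V)
        for t in range(Q.shape[0]):
            S += np.outer(Kf[t], V[t])           # fold key/value t into the state
            z += Kf[t]                           # fold key t into the normalizer
            out[t] = (Qf[t] @ S) / (Qf[t] @ z + eps)  # attend to the prefix only
        return out

Because the state (S, z) has a fixed size independent of sequence length, each generated token costs O(1) rather than O(n), which is the sense in which the title says transformers "are RNNs".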

Papers citing "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention"

50 / 346 papers shown

Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant
57 · 291 · 0 · 27 Mar 2022

Keypoints Tracking via Transformer Networks
Oleksii Nasypanyi, François Rameau
ViT
38 · 0 · 0 · 24 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai
20 · 5 · 0 · 23 Mar 2022

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu, Jiaxiang Liu, L. Chen, Yuxiang Lu, Shi Feng, Zhida Feng, Yu Sun, Hao Tian, Huancheng Wu, Hai-feng Wang
28 · 9 · 0 · 23 Mar 2022

Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang, Wenguan Wang, Tianfei Zhou, Jiaxu Miao, Yawei Luo, Yi Yang
VOS
29 · 74 · 0 · 18 Mar 2022

Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
RALM, BDL
24 · 55 · 0 · 15 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
23 · 94 · 0 · 11 Mar 2022

Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression
A. B. Koyuncu, Han Gao, Atanas Boev, Georgii Gaikov, Elena Alshina, Eckehard Steinbach
ViT
39 · 68 · 0 · 04 Mar 2022

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie
36 · 22 · 0 · 28 Feb 2022

Bayesian Structure Learning with Generative Flow Networks
T. Deleu, António Góis, Chris C. Emezue, M. Rankawat, Simon Lacoste-Julien, Stefan Bauer, Yoshua Bengio
BDL
48 · 143 · 0 · 28 Feb 2022

FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov, Daniil Gavrilov
27 · 0 · 0 · 23 Feb 2022

Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
78 · 222 · 0 · 21 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong
24 · 212 · 0 · 17 Feb 2022

Graph Masked Autoencoders with Transformers
Sixiao Zhang, Hongxu Chen, Haoran Yang, Xiangguo Sun, Philip S. Yu, Guandong Xu
21 · 18 · 0 · 17 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel
43 · 65 · 0 · 15 Feb 2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho, Sunghwan Hong, Seung Wook Kim
ViT
27 · 34 · 0 · 14 Feb 2022

Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long
14 · 90 · 0 · 13 Feb 2022

The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
14 · 42 · 0 · 11 Feb 2022

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin
70 · 2 · 0 · 22 Jan 2022

Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic Segmentation
Shuangjie Xu, Rui Wan, Maosheng Ye, Xiaoyi Zou, Tongyi Cao
3DPC
18 · 32 · 0 · 16 Jan 2022

QuadTree Attention for Vision Transformers
Shitao Tang, Jiahui Zhang, Siyu Zhu, Ping Tan
ViT
169 · 156 · 0 · 08 Jan 2022

Low-Rank Constraints for Fast Inference in Structured Models
Justin T. Chiu, Yuntian Deng, Alexander M. Rush
BDL
32 · 13 · 0 · 08 Jan 2022

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng, Ruslan Khalitov, Tong Yu, Zhirong Yang
25 · 32 · 0 · 06 Jan 2022

SMDT: Selective Memory-Augmented Neural Document Translation
Xu Zhang, Jian Yang, Haoyang Huang, Shuming Ma, Dongdong Zhang, Jinlong Li, Furu Wei
24 · 2 · 0 · 05 Jan 2022

Learning Operators with Coupled Attention
Georgios Kissas, Jacob H. Seidman, Leonardo Ferreira Guilhoto, V. Preciado, George J. Pappas, P. Perdikaris
32 · 110 · 0 · 04 Jan 2022

Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool
ViT
21 · 80 · 0 · 17 Dec 2021

AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin, Arash Vahdat, J. Álvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov
ViT
35 · 314 · 0 · 14 Dec 2021

Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan, Xihao Wang, Xian Wei
ViT
31 · 3 · 0 · 10 Dec 2021

3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu, Chaoyi Zhang, Heng Wang, Dingxin Zhang, Yang Song, Tiange Xiang, Dongnan Liu, Weidong (Tom) Cai
ViT, MedIm
21 · 32 · 0 · 09 Dec 2021

STJLA: A Multi-Context Aware Spatio-Temporal Joint Linear Attention Network for Traffic Forecasting
Yuchen Fang, Yanjun Qin, Haiyong Luo, Fang Zhao, Chenxing Wang
GNN, AI4TS
19 · 1 · 0 · 04 Dec 2021

Multi-View Stereo with Transformer
Jie Zhu, Bo Peng, Wanqing Li, Haifeng Shen, Zhe Zhang, Jianjun Lei
ViT
31 · 27 · 0 · 01 Dec 2021

Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity
Byungseok Roh, Jaewoong Shin, Wuhyun Shin, Saehoon Kim
ViT
13 · 142 · 0 · 29 Nov 2021

Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Moritz Ibing, Gregor Kobsik, Leif Kobbelt
28 · 37 · 0 · 24 Nov 2021

Video Background Music Generation with Controllable Music Transformer
Shangzhe Di, Jiang, Sihan Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan
19 · 91 · 0 · 16 Nov 2021

Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer
Yi-Jen Shih, Shih-Lun Wu, Frank Zalkow, Meinard Müller, Yi-Hsuan Yang
35 · 76 · 0 · 07 Nov 2021

Efficiently Modeling Long Sequences with Structured State Spaces
Albert Gu, Karan Goel, Christopher Ré
52 · 1,665 · 0 · 31 Oct 2021

SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schön, Li Zhang
18 · 161 · 0 · 22 Oct 2021

Transformer Acceleration with Dynamic Sparse Attention
Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie
19 · 20 · 0 · 21 Oct 2021

Sub-word Level Lip Reading With Visual Attention
Prajwal K R, Triantafyllos Afouras, Andrew Zisserman
12 · 92 · 0 · 14 Oct 2021

MELONS: generating melody with long-term structure using transformers and structure graph
Yi Zou, Pei Zou, Yi Zhao, Kai Zhang, Ran Zhang, Xiaorui Wang
MGen
30 · 33 · 0 · 11 Oct 2021

KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao, Jen-Yu Liu, Yi-Hsuan Yang
27 · 5 · 0 · 08 Oct 2021

Token Pooling in Vision Transformers
D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel
ViT
76 · 66 · 0 · 08 Oct 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis
Despoina Paschalidou, Amlan Kar, Maria Shugrina, Karsten Kreis, Andreas Geiger, Sanja Fidler
3DV, ViT
33 · 148 · 0 · 07 Oct 2021

Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng, Huijie Pan, Lingpeng Kong
28 · 3 · 0 · 06 Oct 2021

Classification of hierarchical text using geometric deep learning: the case of clinical trials corpus
Sohrab Ferdowsi, Nikolay Borissov, J. Knafou, P. Amini, Douglas Teodoro
16 · 7 · 0 · 04 Oct 2021

Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
MQ
25 · 133 · 0 · 27 Sep 2021

Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer
RALM
25 · 80 · 0 · 19 Sep 2021

Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers
Ángel Martínez-González, M. Villamizar, J. Odobez
ViT
13 · 69 · 0 · 15 Sep 2021

Anchor DETR: Query Design for Transformer-Based Object Detection
Yingming Wang, Xinming Zhang, Tong Yang, Jian Sun
ViT
16 · 53 · 0 · 15 Sep 2021

PnP-DETR: Towards Efficient Visual Analysis with Transformers
Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
ViT
24 · 82 · 0 · 15 Sep 2021