Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret
arXiv: 2006.16236 · 29 June 2020
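The paper's central observation is that softmax attention can be replaced by a kernel feature map φ, so that causal attention becomes a recurrence over a running state S = Σ φ(k_j) v_jᵀ and normalizer z = Σ φ(k_j), giving O(n) time for autoregressive generation. Below is a minimal NumPy sketch of that recurrence using the φ(x) = elu(x) + 1 feature map from the paper; it is an illustration, not the authors' optimized CUDA implementation, and the function names are my own.

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1: positive everywhere, as required for a valid
    # (unnormalized) attention weight. For x > 0 this is x + 1; else exp(x).
    return np.where(x > 0, x + 1.0, np.exp(x))

def causal_linear_attention(Q, K, V, eps=1e-6):
    """Autoregressive linear attention computed as an RNN over time steps.

    Maintains the running state S = sum_{j<=i} phi(k_j) v_j^T and the
    normalizer z = sum_{j<=i} phi(k_j), so each step costs O(d * d_v)
    independent of sequence length.
    """
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    n, d = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d, d_v))
    z = np.zeros(d)
    out = np.zeros((n, d_v))
    for i in range(n):
        S += np.outer(Kf[i], V[i])          # accumulate phi(k_i) v_i^T
        z += Kf[i]                          # accumulate phi(k_i)
        out[i] = (Qf[i] @ S) / (Qf[i] @ z + eps)
    return out
```

The recurrence produces exactly the same output as the quadratic "masked similarity matrix" formulation φ(Q)φ(K)ᵀ with a causal mask and row normalization, which is the equivalence the paper's title refers to.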
Papers citing "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" (50 of 346 shown):
- Bird-Eye Transformers for Text Generation Models. Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz. 08 Oct 2022.
- Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules. Kazuki Irie, Jürgen Schmidhuber. 07 Oct 2022.
- WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability. Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang. 05 Oct 2022. Tags: ViT, AI4TS.
- Transformer Meets Boundary Value Inverse Problems. Ruchi Guo, Shuhao Cao, Long Chen. 29 Sep 2022. Tags: MedIm.
- Lightweight Monocular Depth Estimation with an Edge Guided Network. Xingshuai Dong, Matthew A. Garratt, S. Anavatti, H. Abbass, Junyu Dong. 29 Sep 2022. Tags: MDE.
- Effective General-Domain Data Inclusion for the Machine Translation Task by Vanilla Transformers. H. Soliman. 28 Sep 2022.
- Liquid Structural State-Space Models. Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus. 26 Sep 2022. Tags: AI4TS.
- From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion. Rui Wan, Shuangjie Xu, Wei Wu, Xiaoyi Zou, Tongyi Cao. 25 Sep 2022. Tags: 3DPC.
- Hand Hygiene Assessment via Joint Step Segmentation and Key Action Scorer. Chenglong Li, Qiwen Zhu, Tubiao Liu, Jin Tang, Yu Su. 25 Sep 2022.
- Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence. Sunghwan Hong, Seokju Cho, Seung Wook Kim, Stephen Lin. 19 Sep 2022. Tags: 3DV.
- Quantum Vision Transformers. El Amine Cherrat, Iordanis Kerenidis, Natansh Mathur, Jonas Landman, M. Strahm, Yun. Y Li. 16 Sep 2022. Tags: ViT.
- Efficient Methods for Natural Language Processing: A Survey. Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz. 31 Aug 2022.
- A Circular Window-based Cascade Transformer for Online Action Detection. Shuyuan Cao, Weihua Luo, Bairui Wang, Wei Emma Zhang, Lin Ma. 30 Aug 2022.
- Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition. A. Andrusenko, R. Nasretdinov, A. Romanenko. 16 Aug 2022.
- Controlling Perceived Emotion in Symbolic Music Generation with Monte Carlo Tree Search. Lucas N. Ferreira, Lili Mou, Jim Whitehead, Levi H. S. Lelis. 10 Aug 2022.
- SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences. Peng Qi, Guangtao Wang, Jing Huang. 03 Aug 2022.
- Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization. T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang. 01 Aug 2022.
- Neural Architecture Search on Efficient Transformers and Beyond. Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong. 28 Jul 2022.
- 3D Siamese Transformer Network for Single Object Tracking on Point Clouds. Le Hui, Lingpeng Wang, Ling-Yu Tang, Kaihao Lan, Jin Xie, Jian Yang. 25 Jul 2022. Tags: ViT, 3DPC.
- Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation. Sunghwan Hong, Seokju Cho, Jisu Nam, Stephen Lin, Seung Wook Kim. 22 Jul 2022. Tags: ViT.
- Eliminating Gradient Conflict in Reference-based Line-Art Colorization. Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang. 13 Jul 2022.
- Pure Transformers are Powerful Graph Learners. Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong. 06 Jul 2022.
- CTrGAN: Cycle Transformers GAN for Gait Transfer. Shahar Mahpod, Noam Gaash, Hay Hoffman, Gil Ben-Artzi. 30 Jun 2022. Tags: ViT.
- Deformable Graph Transformer. Jinyoung Park, Seongjun Yun, Hyeon-ju Park, Jaewoo Kang, Jisu Jeong, KyungHyun Kim, Jung-Woo Ha, Hyunwoo J. Kim. 29 Jun 2022.
- Long Range Language Modeling via Gated State Spaces. Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur. 27 Jun 2022. Tags: Mamba.
- EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm. Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao. 19 Jun 2022. Tags: ViT.
- SimA: Simple Softmax-free Attention for Vision Transformers. Soroush Abbasi Koohpayegani, Hamed Pirsiavash. 17 Jun 2022.
- Online Segmentation of LiDAR Sequences: Dataset and Algorithm. Romain Loiseau, Mathieu Aubry, Loïc Landrieu. 16 Jun 2022. Tags: 3DPC.
- Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis. Rania Briq, Chuhang Zou, L. Pishchulin, Christopher Broaddus, Juergen Gall. 14 Jun 2022.
- Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules. Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber. 03 Jun 2022. Tags: AI4TS.
- AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation. Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Linfu Xie, Bing Yang, Xiong Zhang, Dan Su. 01 Jun 2022.
- Chefs' Random Tables: Non-Trigonometric Random Features. Valerii Likhosherstov, K. Choromanski, Kumar Avinava Dubey, Frederick Liu, Tamás Sarlós, Adrian Weller. 30 May 2022.
- Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning. Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio. 30 May 2022. Tags: AI4CE.
- COFS: Controllable Furniture layout Synthesis. W. Para, Paul Guerrero, Niloy Mitra, Peter Wonka. 29 May 2022. Tags: 3DV.
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré. 27 May 2022. Tags: VLM.
- Training Language Models with Memory Augmentation. Zexuan Zhong, Tao Lei, Danqi Chen. 25 May 2022. Tags: RALM.
- OnePose: One-Shot Object Pose Estimation without CAD Models. Jiaming Sun, Zihao Wang, Siyu Zhang, Xingyi He He, Hongcheng Zhao, Guofeng Zhang, Xiaowei Zhou. 24 May 2022.
- KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation. Ta-Chung Chi, Ting-Han Fan, Peter J. Ramadge, Alexander I. Rudnicky. 20 May 2022.
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction. Zhenpei Yang, Zhile Ren, Miguel Angel Bautista, Zaiwei Zhang, Qi Shan, Qi-Xing Huang. 16 May 2022. Tags: 3DH.
- Symphony Generation with Permutation Invariant Language Model. Jiafeng Liu, Yuanliang Dong, Zehua Cheng, Xinran Zhang, Xiaobing Li, Feng Yu, Maosong Sun. 10 May 2022.
- Sequencer: Deep LSTM for Image Classification. Yuki Tatsunami, Masato Taki. 04 May 2022. Tags: VLM, ViT.
- Attention Mechanism in Neural Networks: Where it Comes and Where it Goes. Derya Soydaner. 27 Apr 2022. Tags: 3DV.
- Context-Aware Sequence Alignment using 4D Skeletal Augmentation. Taein Kwon, Bugra Tekin, Siyu Tang, Marc Pollefeys. 26 Apr 2022.
- Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention. Tong Yu, Ruslan Khalitov, Lei Cheng, Zhirong Yang. 22 Apr 2022. Tags: MoE.
- Efficient Linear Attention for Fast and Accurate Keypoint Matching. Suwichaya Suwanwimolkul, S. Komorita. 16 Apr 2022. Tags: 3DPC, 3DV.
- A Call for Clarity in Beam Search: How It Works and When It Stops. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith. 11 Apr 2022.
- Accelerating Attention through Gradient-Based Learned Runtime Pruning. Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, H. Esmaeilzadeh, Mingu Kang. 07 Apr 2022.
- Long Movie Clip Classification with State-Space Video Models. Md. Mohaiminul Islam, Gedas Bertasius. 04 Apr 2022. Tags: VLM.
- InstaFormer: Instance-Aware Image-to-Image Translation with Transformer. Soohyun Kim, Jongbeom Baek, Jihye Park, Gyeongnyeon Kim, Seung Wook Kim. 30 Mar 2022. Tags: ViT.
- REGTR: End-to-end Point Cloud Correspondences with Transformers. Zi Jian Yew, Gim Hee Lee. 28 Mar 2022. Tags: 3DPC, ViT.