Reformer: The Efficient Transformer
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
13 January 2020 · arXiv:2001.04451

Papers citing "Reformer: The Efficient Transformer" (50 of 505 shown)

TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
Huazhang Hu, Sixun Dong, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao · 03 Apr 2022

Deformable Video Transformer
Jue Wang, Lorenzo Torresani · 31 Mar 2022 · ViT

TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid · 30 Mar 2022 · ViT

Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Xin Huang, A. Khetan, Rene Bidart, Zohar Karnin · 27 Mar 2022

Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant · 27 Mar 2022

A Survey on Aspect-Based Sentiment Classification
Gianni Brauwers, Flavius Frasincar · 27 Mar 2022 · LLMAG

A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers, Flavius Frasincar · 27 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai · 23 Mar 2022

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu, Jiaxiang Liu, L. Chen, Yuxiang Lu, Shi Feng, Zhida Feng, Yu Sun, Hao Tian, Huancheng Wu, Hai-feng Wang · 23 Mar 2022

Towards Abstractive Grounded Summarization of Podcast Transcripts
Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu · 22 Mar 2022

Memorizing Transformers
Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy · 16 Mar 2022 · RALM

Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong · 15 Mar 2022 · RALM, BDL

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur · 11 Mar 2022

DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature Extraction
Jiajun Fei, Ziyu Zhu, Wenlei Liu, Zhidong Deng, Mingyang Li, Huanjun Deng, Shuo Zhang · 08 Mar 2022 · 3DPC

DCT-Former: Efficient Self-Attention with Discrete Cosine Transform
Carmelo Scribano, Giorgia Franchini, M. Prato, Marko Bertogna · 02 Mar 2022

FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Bin-Rui Li, Zhongming Yu, Tian Zheng, R. Wu, Xiwen Zhang, Jian Peng, Yang You · 02 Mar 2022 · AI4CE

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie · 28 Feb 2022

A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting
Benhan Li, Shengdong Du, Tianrui Li · 23 Feb 2022 · AI4TS

Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting
Dazhao Du, Fuchun Sun, Zhewei Wei · 23 Feb 2022 · AI4TS

Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le · 21 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Huicai Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong · 17 Feb 2022

Graph Masked Autoencoders with Transformers
Sixiao Zhang, Hongxu Chen, Haoran Yang, Xiangguo Sun, Philip S. Yu, Guandong Xu · 17 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel · 15 Feb 2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho, Sunghwan Hong, Seung Wook Kim · 14 Feb 2022 · ViT

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi, Y. Carmon, Jonathan Berant · 13 Feb 2022

Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long · 13 Feb 2022

ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan, Y. Tan, C. Chan, Joon Huang Chuah · 11 Feb 2022 · VLM, ViT

Exploiting Spatial Sparsity for Event Cameras with Visual Transformers
Zuowen Wang, Yuhuang Hu, Shih-Chii Liu · 10 Feb 2022 · ViT

Particle Transformer for Jet Tagging
H. Qu, Congqiao Li, Sitian Qian · 08 Feb 2022 · ViT, MedIm

ETSformer: Exponential Smoothing Transformers for Time-series Forecasting
Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Guosheng Lin · 03 Feb 2022 · AI4TS

BOAT: Bilateral Local Attention Vision Transformer
Tan Yu, Gangming Zhao, Ping Li, Yizhou Yu · 31 Jan 2022 · ViT

Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim, Jeonggil Ko · 30 Jan 2022

FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin · 30 Jan 2022 · AI4TS

Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences
Yikuan Li, R. M. Wehbe, F. Ahmad, Hanyin Wang, Yuan Luo · 27 Jan 2022 · VLM, MedIm

CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Herbert Ullrich, Jan Drchal, Martin Rýpar, Hana Vincourová, Václav Moravec · 26 Jan 2022 · HILM

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin · 22 Jan 2022

Representing Long-Range Context for Graph Neural Networks with Global Attention
Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica · 21 Jan 2022 · GNN

A Literature Survey of Recent Advances in Chatbots
Guendalina Caldarini, Sardar F. Jaf, K. McGarry · 17 Jan 2022 · AI4CE

Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis · 17 Jan 2022 · CLL

Video Transformers: A Survey
Javier Selva, A. S. Johansen, Sergio Escalera, Kamal Nasrollahi, T. Moeslund, Albert Clapés · 16 Jan 2022 · ViT

A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song · 14 Jan 2022

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng, Ruslan Khalitov, Tong Yu, Zhirong Yang · 06 Jan 2022

ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer
Yue Ju, Alka Isac, Yimin Nie · 30 Dec 2021 · AI4TS

A Lightweight and Accurate Spatial-Temporal Transformer for Traffic Forecasting
Guanyao Li, Shuhan Zhong, Shueng-Han Gary Chan, Ruiyuan Li, Chih-Chieh Hung, Wen-Chih Peng · 30 Dec 2021 · AI4TS

ELSA: Enhanced Local Self-Attention for Vision Transformer
Jingkai Zhou, Pichao Wang, Fan Wang, Qiong Liu, Hao Li, Rong Jin · 23 Dec 2021 · ViT

Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar, Enamul Hoque, J. Huang · 22 Dec 2021

Efficient Visual Tracking with Exemplar Transformers
Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool · 17 Dec 2021 · ViT

Block-Skim: Efficient Question Answering for Transformer
Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu · 16 Dec 2021

AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin, Arash Vahdat, J. Álvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov · 14 Dec 2021 · ViT

Self-attention Does Not Need O(n^2) Memory
M. Rabe, Charles Staats · 10 Dec 2021 · LRM