Reformer: The Efficient Transformer [VLM]
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
13 January 2020

Papers citing "Reformer: The Efficient Transformer"

Showing 50 of 505 citing papers.
TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
Huazhang Hu, Sixun Dong, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao
03 Apr 2022

Deformable Video Transformer [ViT]
Jue Wang, Lorenzo Torresani
31 Mar 2022

TubeDETR: Spatio-Temporal Video Grounding with Transformers [ViT]
Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
30 Mar 2022

Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Xin Huang, A. Khetan, Rene Bidart, Zohar Karnin
27 Mar 2022

Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant
27 Mar 2022

A Survey on Aspect-Based Sentiment Classification [LLMAG]
Gianni Brauwers, Flavius Frasincar
27 Mar 2022

A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers, Flavius Frasincar
27 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai
23 Mar 2022

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu, Jiaxiang Liu, L. Chen, Yuxiang Lu, Shi Feng, Zhida Feng, Yu Sun, Hao Tian, Huancheng Wu, Hai-feng Wang
23 Mar 2022

Towards Abstractive Grounded Summarization of Podcast Transcripts
Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu
22 Mar 2022

Memorizing Transformers [RALM]
Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy
16 Mar 2022

Long Document Summarization with Top-down and Bottom-up Inference [RALM, BDL]
Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong
15 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
11 Mar 2022

DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature Extraction [3DPC]
Jiajun Fei, Ziyu Zhu, Wenlei Liu, Zhidong Deng, Mingyang Li, Huanjun Deng, Shuo Zhang
08 Mar 2022

DCT-Former: Efficient Self-Attention with Discrete Cosine Transform
Carmelo Scribano, Giorgia Franchini, M. Prato, Marko Bertogna
02 Mar 2022

FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours [AI4CE]
Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Bin-Rui Li, Zhongming Yu, Tian Zheng, R. Wu, Xiwen Zhang, Jian Peng, Yang You
02 Mar 2022

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie
28 Feb 2022

A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting [AI4TS]
Benhan Li, Shengdong Du, Tianrui Li
23 Feb 2022

Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting [AI4TS]
Dazhao Du, Fuchun Sun, Zhewei Wei
23 Feb 2022

Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
21 Feb 2022

cosFormer: Rethinking Softmax in Attention
Zhen Qin, Weixuan Sun, Hui Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong
17 Feb 2022

Graph Masked Autoencoders with Transformers
Sixiao Zhang, Hongxu Chen, Haoran Yang, Xiangguo Sun, Philip S. Yu, Guandong Xu
17 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel
15 Feb 2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers [ViT]
Seokju Cho, Sunghwan Hong, Seung Wook Kim
14 Feb 2022

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi, Y. Carmon, Jonathan Berant
13 Feb 2022

Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long
13 Feb 2022

ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning [VLM, ViT]
J. Tan, Y. Tan, C. Chan, Joon Huang Chuah
11 Feb 2022

Exploiting Spatial Sparsity for Event Cameras with Visual Transformers [ViT]
Zuowen Wang, Yuhuang Hu, Shih-Chii Liu
10 Feb 2022

Particle Transformer for Jet Tagging [ViT, MedIm]
H. Qu, Congqiao Li, Sitian Qian
08 Feb 2022

ETSformer: Exponential Smoothing Transformers for Time-series Forecasting [AI4TS]
Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Guosheng Lin
03 Feb 2022

BOAT: Bilateral Local Attention Vision Transformer [ViT]
Tan Yu, Gangming Zhao, Ping Li, Yizhou Yu
31 Jan 2022

Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim, Jeonggil Ko
30 Jan 2022

FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting [AI4TS]
Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin
30 Jan 2022

Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences [VLM, MedIm]
Yikuan Li, R. M. Wehbe, F. Ahmad, Hanyin Wang, Yuan Luo
27 Jan 2022

CsFEVER and CTKFacts: Acquiring Czech data for fact verification [HILM]
Herbert Ullrich, Jan Drchal, Martin Rýpar, Hana Vincourová, Václav Moravec
26 Jan 2022

GLassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction
Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin
22 Jan 2022

Representing Long-Range Context for Graph Neural Networks with Global Attention [GNN]
Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica
21 Jan 2022

A Literature Survey of Recent Advances in Chatbots [AI4CE]
Guendalina Caldarini, Sardar F. Jaf, K. McGarry
17 Jan 2022

Continual Transformers: Redundancy-Free Attention for Online Inference [CLL]
Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis
17 Jan 2022

Video Transformers: A Survey [ViT]
Javier Selva, A. S. Johansen, Sergio Escalera, Kamal Nasrollahi, T. Moeslund, Albert Clapés
16 Jan 2022

A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
14 Jan 2022

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Lei Cheng, Ruslan Khalitov, Tong Yu, Zhirong Yang
06 Jan 2022

ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer [AI4TS]
Yue Ju, Alka Isac, Yimin Nie
30 Dec 2021

A Lightweight and Accurate Spatial-Temporal Transformer for Traffic Forecasting [AI4TS]
Guanyao Li, Shuhan Zhong, Shueng-Han Gary Chan, Ruiyuan Li, Chih-Chieh Hung, Wen-Chih Peng
30 Dec 2021

ELSA: Enhanced Local Self-Attention for Vision Transformer [ViT]
Jingkai Zhou, Pichao Wang, Fan Wang, Qiong Liu, Hao Li, Rong Jin
23 Dec 2021

Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar, Enamul Hoque, J. Huang
22 Dec 2021

Efficient Visual Tracking with Exemplar Transformers [ViT]
Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool
17 Dec 2021

Block-Skim: Efficient Question Answering for Transformer
Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu
16 Dec 2021

AdaViT: Adaptive Tokens for Efficient Vision Transformer [ViT]
Hongxu Yin, Arash Vahdat, J. Álvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov
14 Dec 2021

Self-attention Does Not Need $O(n^2)$ Memory [LRM]
M. Rabe, Charles Staats
10 Dec 2021