Reformer: The Efficient Transformer
arXiv: 2001.04451
13 January 2020
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
Papers citing "Reformer: The Efficient Transformer"
50 / 505 papers shown
PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving
Sepidehsadat Hosseini
M. Shabani
Saghar Irandoust
Yasutaka Furukawa
DiffM
38
12
0
24 Nov 2022
RNTrajRec: Road Network Enhanced Trajectory Recovery with Spatial-Temporal Transformer
Yuqi Chen
Hanyuan Zhang
Weiwei Sun
B. Zheng
36
39
0
23 Nov 2022
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
Zineng Tang
Jaemin Cho
Jie Lei
Joey Tianyi Zhou
VLM
24
9
0
21 Nov 2022
SeDR: Segment Representation Learning for Long Documents Dense Retrieval
Junying Chen
Qingcai Chen
Dongfang Li
Yutao Huang
28
6
0
20 Nov 2022
DeepParliament: A Legal domain Benchmark & Dataset for Parliament Bills Prediction
Ankit Pal
AILaw
28
0
0
15 Nov 2022
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
Wenhao Li
Xiaoyuan Yi
Jinyi Hu
Maosong Sun
Xing Xie
44
0
0
14 Nov 2022
Discharge Summary Hospital Course Summarisation of In Patient Electronic Health Record Text with Clinical Concept Guided Deep Pre-Trained Transformer Models
Thomas Searle
Zina M. Ibrahim
J. Teo
Richard J. B. Dobson
21
29
0
14 Nov 2022
HigeNet: A Highly Efficient Modeling for Long Sequence Time Series Prediction in AIOps
Jiajia Li
Feng Tan
Cheng He
Zikai Wang
Haitao Song
Lingfei Wu
Pengwei Hu
28
0
0
13 Nov 2022
Equivariance with Learned Canonicalization Functions
Sekouba Kaba
Arnab Kumar Mondal
Yan Zhang
Yoshua Bengio
Siamak Ravanbakhsh
49
64
0
11 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
20
53
0
09 Nov 2022
Efficiently Scaling Transformer Inference
Reiner Pope
Sholto Douglas
Aakanksha Chowdhery
Jacob Devlin
James Bradbury
Anselm Levskaya
Jonathan Heek
Kefan Xiao
Shivani Agrawal
J. Dean
48
300
0
09 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
32
25
0
07 Nov 2022
How Far are We from Robust Long Abstractive Summarization?
Huan Yee Koh
Jiaxin Ju
He Zhang
Ming Liu
Shirui Pan
HILM
33
39
0
30 Oct 2022
Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost
Sungjun Cho
Seonwoo Min
Jinwoo Kim
Moontae Lee
Honglak Lee
Seunghoon Hong
42
3
0
27 Oct 2022
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data
Huy Hoang Nguyen
Matthew B. Blaschko
S. Saarakkala
A. Tiulpin
MedIm
AI4CE
53
15
0
25 Oct 2022
How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
Samuel Cahyawijaya
Bryan Wilie
Holy Lovenia
Huang Zhong
Mingqian Zhong
Yuk-Yu Nancy Ip
Pascale Fung
LM&MA
28
2
0
25 Oct 2022
Graphically Structured Diffusion Models
Christian D. Weilbach
William Harvey
Frank Wood
DiffM
40
7
0
20 Oct 2022
An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries
Aryan Pedawi
P. Gniewek
Chao-Ling Chang
Brandon M. Anderson
H. V. D. Bedem
41
5
0
19 Oct 2022
Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Botao Yu
Peiling Lu
Rui Wang
Wei Hu
Xu Tan
Wei Ye
Shikun Zhang
Tao Qin
Tie-Yan Liu
MGen
35
55
0
19 Oct 2022
The Devil in Linear Transformer
Zhen Qin
Xiaodong Han
Weixuan Sun
Dongxu Li
Lingpeng Kong
Nick Barnes
Yiran Zhong
36
70
0
19 Oct 2022
ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
Haoran You
Zhanyi Sun
Huihong Shi
Zhongzhi Yu
Yang Katie Zhao
Yongan Zhang
Chaojian Li
Baopu Li
Yingyan Lin
ViT
25
81
0
18 Oct 2022
What Makes Convolutional Models Great on Long Sequence Modeling?
Yuhong Li
Tianle Cai
Yi Zhang
De-huai Chen
Debadeepta Dey
VLM
39
96
0
17 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
3DV
46
9
0
14 Oct 2022
An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification
Ilias Chalkidis
Xiang Dai
Manos Fergadiotis
Prodromos Malakasiotis
Desmond Elliott
44
34
0
11 Oct 2022
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
36
12
0
10 Oct 2022
Bird-Eye Transformers for Text Generation Models
Lei Sha
Yuhang Song
Yordan Yordanov
Tommaso Salvatori
Thomas Lukasiewicz
30
0
0
08 Oct 2022
KG-MTT-BERT: Knowledge Graph Enhanced BERT for Multi-Type Medical Text Classification
Yong He
Cheng Wang
Shun Zhang
Na Li
Zhao Li
Zhenyu Zeng
AI4MH
44
10
0
08 Oct 2022
Edge-Varying Fourier Graph Networks for Multivariate Time Series Forecasting
Kun Yi
Qi Zhang
Liang Hu
Hui He
Ning An
LongBing Cao
ZhenDong Niu
AI4TS
59
3
0
06 Oct 2022
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis
Haixu Wu
Teng Hu
Yong Liu
Hang Zhou
Jianmin Wang
Mingsheng Long
AI4TS
AIFin
61
715
0
05 Oct 2022
Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective
Peter Baumgartner
Daniel V. Smith
Mashud Rana
Reena Kapoor
Elena Tartaglia
A. Schutt
Ashfaqur Rahman
John Taylor
S. Dunstall
27
4
0
04 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
45
26
0
03 Oct 2022
Grouped self-attention mechanism for a memory-efficient Transformer
Bumjun Jung
Yusuke Mukuta
Tatsuya Harada
AI4TS
14
3
0
02 Oct 2022
Spiking Neural Networks for event-based action recognition: A new task to understand their advantage
Alex Vicente-Sola
D. L. Manna
Paul Kirkland
G. D. Caterina
Trevor Bihl
26
8
0
29 Sep 2022
Searching a High-Performance Feature Extractor for Text Recognition Network
Hui Zhang
Quanming Yao
James T. Kwok
X. Bai
30
7
0
27 Sep 2022
Explainable Graph Pyramid Autoformer for Long-Term Traffic Forecasting
Weiheng Zhong
Tanwi Mallick
Hadi Meidani
Jane Macfarlane
Prasanna Balaprakash
AI4TS
31
5
0
27 Sep 2022
Liquid Structural State-Space Models
Ramin Hasani
Mathias Lechner
Tsun-Hsuan Wang
Makram Chahine
Alexander Amini
Daniela Rus
AI4TS
107
98
0
26 Sep 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
44
4
0
26 Sep 2022
Efficient Long Sequential User Data Modeling for Click-Through Rate Prediction
Qiwei Chen
Yue Xu
Changhua Pei
Shanshan Lv
Tao Zhuang
Junfeng Ge
3DV
23
3
0
25 Sep 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
38
183
0
21 Sep 2022
Adapting Pretrained Text-to-Text Models for Long Text Sequences
Wenhan Xiong
Anchit Gupta
Shubham Toshniwal
Yashar Mehdad
Wen-tau Yih
RALM
VLM
62
30
0
21 Sep 2022
An Efficient End-to-End Transformer with Progressive Tri-modal Attention for Multi-modal Emotion Recognition
Yang Wu
Pai Peng
Zhenyu Zhang
Yanyan Zhao
Bing Qin
32
1
0
20 Sep 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
40
58
0
20 Sep 2022
Graph Reasoning Transformer for Image Parsing
Dong Zhang
Jinhui Tang
Kwang-Ting Cheng
ViT
24
16
0
20 Sep 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
John J. Nay
ELM
AILaw
88
27
0
14 Sep 2022
SkIn: Skimming-Intensive Long-Text Classification Using BERT for Medical Corpus
Yufeng Zhao
Haiying Che
VLM
26
0
0
13 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
35
109
0
31 Aug 2022
Spatio-Temporal Wind Speed Forecasting using Graph Networks and Novel Transformer Architectures
Lars Odegaard Bentsen
N. Warakagoda
R. Stenbro
P. Engelstad
AI4TS
29
99
0
29 Aug 2022
Survey: Exploiting Data Redundancy for Optimization of Deep Learning
Jou-An Chen
Wei Niu
Bin Ren
Yanzhi Wang
Xipeng Shen
23
24
0
29 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
26
2
0
11 Aug 2022
Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang
Yao-Min Zhao
Peter J. Liu
RALM
LLMAG
42
63
0
08 Aug 2022