Reformer: The Efficient Transformer (arXiv: 2001.04451)
13 January 2020
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
Papers citing "Reformer: The Efficient Transformer" (50 / 505 papers shown)
Reservoir Transformers (30 Dec 2020). Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela. 35 · 17 · 0
SChuBERT: Scholarly Document Chunks with BERT-encoding boost Citation Count Prediction (21 Dec 2020). Thomas van Dongen, Gideon Maillette de Buy Wenniger, Lambert Schomaker. 27 · 24 · 0
LieTransformer: Equivariant self-attention for Lie Groups (20 Dec 2020). M. Hutchinson, Charline Le Lan, Sheheryar Zaidi, Emilien Dupont, Yee Whye Teh, Hyunjik Kim. 31 · 111 · 0
Rewriter-Evaluator Architecture for Neural Machine Translation (10 Dec 2020). Yangming Li, Kaisheng Yao. 16 · 2 · 0
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers (01 Dec 2020). Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen. Tags: ViT. 52 · 527 · 0
Multi-stage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images (29 Nov 2020). Rui Li, Shunyi Zheng, Chenxi Duan, Jianlin Su, Ce Zhang. 35 · 188 · 0
ReAssert: Deep Learning for Assert Generation (19 Nov 2020). Robert White, J. Krinke. 34 · 14 · 0
End-to-End Object Detection with Adaptive Clustering Transformer (18 Nov 2020). Minghang Zheng, Peng Gao, Renrui Zhang, Kunchang Li, Xiaogang Wang, Hongsheng Li, Hao Dong. Tags: ViT. 41 · 193 · 0
Multi-document Summarization via Deep Learning Techniques: A Survey (10 Nov 2020). Congbo Ma, W. Zhang, Mingyu Guo, Hu Wang, Quan Z. Sheng. 13 · 126 · 0
Detecting Hallucinated Content in Conditional Neural Sequence Generation (05 Nov 2020). Chunting Zhou, Graham Neubig, Jiatao Gu, Mona T. Diab, P. Guzmán, Luke Zettlemoyer, Marjan Ghazvininejad. Tags: HILM. 39 · 196 · 0
Long Document Ranking with Query-Directed Sparse Transformer (23 Oct 2020). Jyun-Yu Jiang, Chenyan Xiong, Chia-Jung Lee, Wei Wang. 33 · 25 · 0
Open Question Answering over Tables and Text (20 Oct 2020). Wenhu Chen, Ming-Wei Chang, Eva Schlinger, Wenjie Wang, William W. Cohen. Tags: LMTD, RALM. 31 · 194 · 0
Rethinking Document-level Neural Machine Translation (18 Oct 2020). Zewei Sun, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Shujian Huang, Jiajun Chen, Lei Li. Tags: VLM. 83 · 47 · 0
Memformer: A Memory-Augmented Transformer for Sequence Modeling (14 Oct 2020). Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, A. Geramifard, Zhou Yu. 22 · 49 · 0
Pretrained Transformers for Text Ranking: BERT and Beyond (13 Oct 2020). Jimmy J. Lin, Rodrigo Nogueira, Andrew Yates. Tags: VLM. 244 · 612 · 0
SMYRF: Efficient Attention using Asymmetric Clustering (11 Oct 2020). Giannis Daras, Nikita Kitaev, Augustus Odena, A. Dimakis. 31 · 44 · 0
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task (11 Oct 2020). Z. Li, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita. 36 · 15 · 0
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels (04 Oct 2020). Ilias Chalkidis, Manos Fergadiotis, Sotiris Kotitsas, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos. Tags: VLM, AI4TS. 28 · 84 · 0
Rethinking Attention with Performers (30 Sep 2020). K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller. 66 · 1,527 · 0
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems (28 Sep 2020). Andrea Madotto, Samuel Cahyawijaya, Genta Indra Winata, Yan Xu, Zihan Liu, Zhaojiang Lin, Pascale Fung. 42 · 59 · 0
No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension (25 Sep 2020). Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang. 24 · 12 · 0
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners (15 Sep 2020). Timo Schick, Hinrich Schütze. 51 · 956 · 0
Efficient Transformers: A Survey (14 Sep 2020). Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler. Tags: VLM. 114 · 1,104 · 0
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding (13 Sep 2020). Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, S. Sun, Yu Cheng, Jingjing Liu. 43 · 28 · 0
Sparsifying Transformer Models with Trainable Representation Pooling (10 Sep 2020). Michal Pietruszka, Łukasz Borchmann, Lukasz Garncarek. 23 · 10 · 0
AutoTrans: Automating Transformer Design via Reinforced Architecture Search (04 Sep 2020). Wei-wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie. 32 · 18 · 0
HiPPO: Recurrent Memory with Optimal Polynomial Projections (17 Aug 2020). Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré. 54 · 492 · 0
Compression of Deep Learning Models for Text: A Survey (12 Aug 2020). Manish Gupta, Puneet Agrawal. Tags: VLM, MedIm, AI4CE. 22 · 115 · 0
Aligning AI With Shared Human Values (05 Aug 2020). Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jingkai Li, D. Song, Jacob Steinhardt. 63 · 522 · 0
Conformer-Kernel with Query Term Independence for Document Retrieval (20 Jul 2020). Bhaskar Mitra, Sebastian Hofstatter, Hamed Zamani, Nick Craswell. 27 · 21 · 0
S2RMs: Spatially Structured Recurrent Modules (13 Jul 2020). Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, M. Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf. 21 · 14 · 0
Data Movement Is All You Need: A Case Study on Optimizing Transformers (30 Jun 2020). A. Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler. 36 · 131 · 0
Recurrent Quantum Neural Networks (25 Jun 2020). Johannes Bausch. 26 · 152 · 0
Sparse GPU Kernels for Deep Learning (18 Jun 2020). Trevor Gale, Matei A. Zaharia, C. Young, Erich Elsen. 17 · 230 · 0
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation (18 Jun 2020). Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith. 41 · 134 · 0
Dynamic Tensor Rematerialization (17 Jun 2020). Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock. 29 · 93 · 0
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers (09 Jun 2020). Tsung-Han Wu, Chun-Chen Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-yi Lee. 26 · 1 · 0
Linformer: Self-Attention with Linear Complexity (08 Jun 2020). Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma. 87 · 1,655 · 0
DeBERTa: Decoding-enhanced BERT with Disentangled Attention (05 Jun 2020). Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. Tags: AAML. 64 · 2,631 · 0
UFO-BLO: Unbiased First-Order Bilevel Optimization (05 Jun 2020). Valerii Likhosherstov, Xingyou Song, K. Choromanski, Jared Davis, Adrian Weller. 34 · 7 · 0
Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers (05 Jun 2020). K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Peter Hawkins, Jared Davis, David Belanger, Lucy J. Colwell, Adrian Weller. 39 · 84 · 0
General-Purpose User Embeddings based on Mobile App Usage (27 May 2020). Junqi Zhang, Bing Bai, Ye Lin, Jian Liang, Kun Bai, Fei Wang. 38 · 35 · 0
FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval (20 May 2020). D. Gao, Linbo Jin, Ben Chen, Minghui Qiu, Peng Li, Yi Wei, Yitao Hu, Haozhe Jasper Wang. Tags: OOD. 25 · 133 · 0
Multiresolution and Multimodal Speech Recognition with Transformers (29 Apr 2020). Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram. 25 · 29 · 0
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching (26 Apr 2020). Liu Yang, Mingyang Zhang, Cheng Li, Michael Bendersky, Marc Najork. 38 · 87 · 0
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation (21 Apr 2020). Gaëtan Hadjeres, Léopold Crestel. 34 · 18 · 0
The Cost of Training NLP Models: A Concise Overview (19 Apr 2020). Or Sharir, Barak Peleg, Y. Shoham. 40 · 210 · 0
Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images (12 Apr 2020). Xiaocong Chen, Lina Yao, Yu Zhang. 36 · 197 · 0
Longformer: The Long-Document Transformer (10 Apr 2020). Iz Beltagy, Matthew E. Peters, Arman Cohan. Tags: RALM, VLM. 33 · 3,944 · 0
SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection (22 Mar 2020). Xiaoya Li, Yuxian Meng, Mingxin Zhou, Qinghong Han, Fei Wu, Jiwei Li. 27 · 20 · 0