Big Bird: Transformers for Longer Sequences
arXiv:2007.14062, 28 July 2020
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
Tags: VLM

Papers citing "Big Bird: Transformers for Longer Sequences" (45 of 345 papers shown)
Title | Authors | Tags | Date
Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff | Michael J. Naylor, C. French, Samantha R. Terker, Uday Kamath | - | 12 Jul 2021
Can Deep Neural Networks Predict Data Correlations from Column Names? | Immanuel Trummer | - | 09 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers | Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, Jianfeng Gao | ViT | 01 Jul 2021
OadTR: Online Action Detection with Transformers | Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Zhe Zuo, Changxin Gao, Nong Sang | OffRL, ViT | 21 Jun 2021
XCiT: Cross-Covariance Image Transformers | Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, ..., Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou | ViT | 17 Jun 2021
Pre-Trained Models: Past, Present and Future | Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu | AIFin, MQ, AI4MH | 14 Jun 2021
Thinking Like Transformers | Gail Weiss, Yoav Goldberg, Eran Yahav | AI4CE | 13 Jun 2021
Space-time Mixing Attention for Video Transformer | Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martínez, Georgios Tzimiropoulos | ViT | 10 Jun 2021
A Survey of Transformers | Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu | ViT | 08 Jun 2021
On the Expressive Power of Self-Attention Matrices | Valerii Likhosherstov, K. Choromanski, Adrian Weller | - | 07 Jun 2021
Defending Against Backdoor Attacks in Natural Language Generation | Xiaofei Sun, Xiaoya Li, Yuxian Meng, Xiang Ao, Fei Wu, Jiwei Li, Tianwei Zhang | AAML, SILM | 03 Jun 2021
KVT: k-NN Attention for Boosting Vision Transformers | Pichao Wang, Xue Wang, F. Wang, Ming Lin, Shuning Chang, Hao Li, R. L. Jin | ViT | 28 May 2021
Should We Trust This Summary? Bayesian Abstractive Summarization to The Rescue | Alexios Gidiotis, Grigorios Tsoumakas | UQCV, UD, BDL | 21 May 2021
BookSum: A Collection of Datasets for Long-form Narrative Summarization | Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir R. Radev | RALM | 18 May 2021
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey | Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Erik Cambria | - | 10 May 2021
Poolingformer: Long Document Modeling with Pooling Attention | Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen | - | 10 May 2021
Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents | Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun | AILaw, ELM | 09 May 2021
Towards Clinical Encounter Summarization: Learning to Compose Discharge Summaries from Prior Notes | Han-Chin Shing, Chaitanya P. Shivade, Nima Pourdamghani, Feng Nan, Philip Resnik, Douglas W. Oard, Parminder Bhatia | - | 27 Apr 2021
Case-based Reasoning for Natural Language Queries over Knowledge Bases | Rajarshi Das, Manzil Zaheer, Dung Ngoc Thai, Ameya Godbole, Ethan Perez, Jay Yoon Lee, Lizhen Tan, L. Polymenakos, Andrew McCallum | - | 18 Apr 2021
FUDGE: Controlled Text Generation With Future Discriminators | Kevin Kaichuang Yang, Dan Klein | - | 12 Apr 2021
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding | Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao | ViT | 29 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures | Sushant Singh, A. Mahmood | AI4TS | 23 Mar 2021
Tiny Transformers for Environmental Sound Classification at the Edge | David Elliott, Carlos E. Otero, Steven Wyatt, Evan Martino | - | 22 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation | J. Clark, Dan Garrette, Iulia Turc, John Wieting | - | 11 Mar 2021
Generating Images with Sparse Representations | C. Nash, Jacob Menick, Sander Dieleman, Peter W. Battaglia | - | 05 Mar 2021
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth | Yihe Dong, Jean-Baptiste Cordonnier, Andreas Loukas | - | 05 Mar 2021
SparseBERT: Rethinking the Importance Analysis in Self-attention | Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James T. Kwok | - | 25 Feb 2021
Optimizing Inference Performance of Transformers on CPUs | D. Dice, Alex Kogan | - | 12 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | W. Fedus, Barret Zoph, Noam M. Shazeer | MoE | 11 Jan 2021
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer | Siyu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua-Hong Wu, Haifeng Wang | - | 31 Dec 2020
Reservoir Transformers | Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela | - | 30 Dec 2020
Code Summarization with Structure-induced Transformer | Hongqiu Wu, Hai Zhao, Min Zhang | - | 29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model | Ziyi Yang, Yinfei Yang, Daniel Matthew Cer, Jax Law, Eric F. Darve | SSL | 28 Dec 2020
Rethinking Transformer-based Set Prediction for Object Detection | Zhiqing Sun, Shengcao Cao, Yiming Yang, Kris M. Kitani | ViT | 21 Nov 2020
Multi-document Summarization via Deep Learning Techniques: A Survey | Congbo Ma, W. Zhang, Mingyu Guo, Hu Wang, Quan Z. Sheng | - | 10 Nov 2020
Open Question Answering over Tables and Text | Wenhu Chen, Ming-Wei Chang, Eva Schlinger, W. Wang, William W. Cohen | LMTD, RALM | 20 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond | Jimmy J. Lin, Rodrigo Nogueira, Andrew Yates | VLM | 13 Oct 2020
Efficient Transformers: A Survey | Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler | VLM | 14 Sep 2020
Sparsifying Transformer Models with Trainable Representation Pooling | Michal Pietruszka, Łukasz Borchmann, Lukasz Garncarek | - | 10 Sep 2020
Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems | Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung | - | 14 Aug 2020
Longformer: The Long-Document Transformer | Iz Beltagy, Matthew E. Peters, Arman Cohan | RALM, VLM | 10 Apr 2020
Data Augmentation using Pre-trained Transformer Models | Varun Kumar, Ashutosh Choudhary, Eunah Cho | VLM | 04 Mar 2020
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models | Sandeep Subramanian, Raymond Li, Jonathan Pilault, C. Pal | - | 07 Sep 2019
Text Summarization with Pretrained Encoders | Yang Liu, Mirella Lapata | MILM | 22 Aug 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 20 Apr 2018