2007.14062
Big Bird: Transformers for Longer Sequences
28 July 2020
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
Santiago Ontanon
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
Papers citing
"Big Bird: Transformers for Longer Sequences"
50 / 322 papers shown
Title
DLUE: Benchmarking Document Language Understanding
Ruoxi Xu
Hongyu Lin
Xinyan Guan
Xianpei Han
Yingfei Sun
Le Sun
ELM
18
0
0
16 May 2023
SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric Kernels
Alexander Moreno
Jonathan Mei
Luke Walters
15
0
0
15 May 2023
A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization
Chenhui Shen
Liying Cheng
Xuan-Phi Nguyen
Yang You
Lidong Bing
20
10
0
15 May 2023
Legal Extractive Summarization of U.S. Court Opinions
Emmanuel J. Bauer
Dominik Stammbach
Nianlong Gu
Elliott Ash
AILaw
ELM
18
7
0
15 May 2023
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng
Cole Hawkins
Min-Fong Hong
Aston Zhang
Nikolaos Pappas
Vikas Singh
Shuai Zheng
19
6
0
07 May 2023
The Role of Global and Local Context in Named Entity Recognition
Arthur Amalvy
Vincent Labatut
Richard Dufour
38
4
0
04 May 2023
Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
Sonal Sannigrahi
Josef van Genabith
C. España-Bonet
AILaw
34
4
0
28 Apr 2023
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework
Bin Wang
Xinnian Liang
Jian Yang
Huijia Huang
Shuangzhi Wu
Peihao Wu
Lu Lu
Zejun Ma
Zhoujun Li
LLMAG
KELM
RALM
94
25
0
26 Apr 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Mikhail Burtsev
LRM
14
87
0
19 Apr 2023
Improving Autoregressive NLP Tasks via Modular Linearized Attention
Victor Agostinelli
Lizhong Chen
22
1
0
17 Apr 2023
Explicit and Implicit Semantic Ranking Framework
Xiaofeng Zhu
Thomas Lin
Vishal Anand
M. Calderwood
Eric Clausen-Brown
Gord Lueck
Wen-wai Yim
Cheng Wu
OffRL
10
2
0
11 Apr 2023
Dialogue-Contextualized Re-ranking for Medical History-Taking
Jian Zhu
Ilya Valmianski
Anitha Kannan
19
1
0
04 Apr 2023
SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism
Mehwish Fatima
Tim Kolber
K. Markert
Michael Strube
21
0
0
04 Apr 2023
You Only Segment Once: Towards Real-Time Panoptic Segmentation
Jie Hu
Linyan Huang
Tianhe Ren
Shengchuan Zhang
Rongrong Ji
Liujuan Cao
SSeg
46
54
0
26 Mar 2023
Towards Accurate Post-Training Quantization for Vision Transformer
Yifu Ding
Haotong Qin
Qing-Yu Yan
Z. Chai
Junjie Liu
Xiaolin K. Wei
Xianglong Liu
MQ
54
68
0
25 Mar 2023
Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review
Oliver Vinzelberg
M. Jenkins
Gordon Morison
David McMinn
Z. Tieges
27
6
0
24 Mar 2023
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
31
3
0
20 Mar 2023
Generating Query Focused Summaries without Fine-tuning the Transformer-based Pre-trained Models
D. Abdullah
Shamanth Nayak
Gandharv Suri
Yllias Chali
22
2
0
10 Mar 2023
MUX-PLMs: Data Multiplexing for High-throughput Language Models
Vishvak Murahari
A. Deshpande
Carlos E. Jimenez
Izhak Shafran
Mingqiu Wang
Yuan Cao
Karthik Narasimhan
MoE
13
5
0
24 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
26
6
0
16 Feb 2023
Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering
Xanh Ho
A. Nguyen
Saku Sugawara
Akiko Aizawa
LRM
36
7
0
12 Feb 2023
Local spectral attention for full-band speech enhancement
Zhongshu Hou
Qi Hu
Kai-Jyun Chen
Jing Lu
28
0
0
11 Feb 2023
Generating a Structured Summary of Numerous Academic Papers: Dataset and Method
Shuaiqi Liu
Jiannong Cao
Ruosong Yang
Zhiyuan Wen
46
16
0
09 Feb 2023
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
28
18
0
09 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
28
8
0
03 Feb 2023
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
33
9
0
02 Feb 2023
A Comparative Study of Pretrained Language Models for Long Clinical Text
Yikuan Li
R. M. Wehbe
F. Ahmad
Hanyin Wang
Yuan Luo
LM&MA
ELM
VLM
MedIm
24
78
0
27 Jan 2023
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
Laura Nguyen
Thomas Scialom
Benjamin Piwowarski
Jacopo Staiano
24
7
0
26 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
14
17
0
12 Jan 2023
Dynamic Grained Encoder for Vision Transformers
Lin Song
Songyang Zhang
Songtao Liu
Zeming Li
Xuming He
Hongbin Sun
Jian-jun Sun
Nanning Zheng
ViT
26
34
0
10 Jan 2023
Semi-Structured Object Sequence Encoders
V. Rudramurthy
Riyaz Ahmad Bhat
Chulaka Gunasekara
Siva Sankalp Patel
H. Wan
Tejas I. Dhamecha
Danish Contractor
Marina Danilevsky
59
0
0
03 Jan 2023
GPT Takes the Bar Exam
M. Bommarito
Daniel Martin Katz
ELM
28
152
0
29 Dec 2022
A Length-Extrapolatable Transformer
Yutao Sun
Li Dong
Barun Patra
Shuming Ma
Shaohan Huang
Alon Benhaim
Vishrav Chaudhary
Xia Song
Furu Wei
30
115
0
20 Dec 2022
Medical Knowledge Graph QA for Drug-Drug Interaction Prediction based on Multi-hop Machine Reading Comprehension
Peng Gao
Feng Gao
Jiancheng Ni
Yu Wang
Fei-Yue Wang
12
2
0
19 Dec 2022
OASum: Large-Scale Open Domain Aspect-based Summarization
Xianjun Yang
Kaiqiang Song
Sangwoo Cho
Xiaoyang Wang
Xiaoman Pan
Linda R. Petzold
Dong Yu
RALM
26
24
0
19 Dec 2022
Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo
Xiaodong Liu
Jian Jiao
Denis Xavier Charles
Eren Manavoglu
Tuo Zhao
Jianfeng Gao
120
36
0
15 Dec 2022
MORTY: Structured Summarization for Targeted Information Extraction from Scholarly Articles
M. Y. Jaradeh
M. Stocker
Sören Auer
25
1
0
11 Dec 2022
MOPRD: A multidisciplinary open peer review dataset
Jialiang Lin
Jiaxin Song
Zhangping Zhou
Yidong Chen
X. Shi
25
11
0
09 Dec 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
27
11
0
30 Nov 2022
SeDR: Segment Representation Learning for Long Documents Dense Retrieval
Junying Chen
Qingcai Chen
Dongfang Li
Yutao Huang
20
6
0
20 Nov 2022
GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state
Junyi Bian
Xiaodi Huang
Hong Zhou
Shanfeng Zhu
19
11
0
18 Nov 2022
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
21
21
0
15 Nov 2022
DeepParliament: A Legal domain Benchmark & Dataset for Parliament Bills Prediction
Ankit Pal
AILaw
28
0
0
15 Nov 2022
Cracking Double-Blind Review: Authorship Attribution with Deep Learning
L. Bauersfeld
Angel Romero
Manasi Muglikar
Davide Scaramuzza
6
5
0
14 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
30
3
0
14 Nov 2022
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
Wenhao Li
Xiaoyuan Yi
Jinyi Hu
Maosong Sun
Xing Xie
23
0
0
14 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
17
49
0
09 Nov 2022
Discovering ordinary differential equations that govern time-series
Soren Becker
M. Klein
Alexander Neitz
Giambattista Parascandolo
Niki Kilbertus
AI4TS
19
4
0
05 Nov 2022
BERT for Long Documents: A Case Study of Automated ICD Coding
Arash Afkanpour
Shabir Adeel
H. Bassani
Arkady Epshteyn
Hongbo Fan
...
Sanjana Woonna
S. Zamani
Elli Kanal
M. Fomitchev
Donny Cheung
36
14
0
04 Nov 2022