ResearchTrend.AI
Big Bird: Transformers for Longer Sequences

28 July 2020
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
Santiago Ontanon
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
    VLM

Papers citing "Big Bird: Transformers for Longer Sequences"

50 / 313 papers shown
Title
Contextual Chart Generation for Cyber Deception
David D. Nguyen
David Liebowitz
Surya Nepal
S. Kanhere
Sharif Abuadbba
41
0
0
07 Apr 2024
On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers
Cai Zhou
Rose Yu
Yusu Wang
32
7
0
04 Apr 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
42
6
0
28 Feb 2024
Transformers are Expressive, But Are They Expressive Enough for Regression?
Swaroop Nath
H. Khadilkar
Pushpak Bhattacharyya
26
3
0
23 Feb 2024
Multimodal Transformer With a Low-Computational-Cost Guarantee
Sungjin Park
Edward Choi
46
1
0
23 Feb 2024
Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models
Guanming Xiong
Junwei Bao
Wen Zhao
KELM
51
8
0
23 Feb 2024
Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark
Xiuying Chen
Tairan Wang
Qingqing Zhu
Taicheng Guo
Shen Gao
Zhiyong Lu
Xin Gao
Xiangliang Zhang
70
2
0
22 Feb 2024
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference
Siyu Ren
Kenny Q. Zhu
18
27
0
09 Feb 2024
Jacquard V2: Refining Datasets using the Human In the Loop Data Correction Method
Qiuhao Li
Shenghai Yuan
27
5
0
08 Feb 2024
Interpretation of Intracardiac Electrograms Through Textual Representations
William Jongwon Han
Diana Gomez
Avi Alok
Chaojing Duan
Michael A. Rosenberg
Douglas Weber
Emerson Liu
Ding Zhao
23
1
0
02 Feb 2024
FIMBA: Evaluating the Robustness of AI in Genomics via Feature Importance Adversarial Attacks
Heorhii Skovorodnikov
Hoda AlKhzaimi
AAML
20
1
0
19 Jan 2024
LoMA: Lossless Compressed Memory Attention
Yumeng Wang
Zhenyang Xiao
14
3
0
16 Jan 2024
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering
Alsu Sagirova
Mikhail Burtsev
RALM
28
1
0
29 Nov 2023
Meta-learning of semi-supervised learning from tasks with heterogeneous attribute spaces
Tomoharu Iwata
Atsutoshi Kumagai
24
2
0
09 Nov 2023
Neural Atoms: Propagating Long-range Interaction in Molecular Graphs through Efficient Communication Channel
Xuan Li
Zhanke Zhou
Jiangchao Yao
Yu Rong
Lu Zhang
Bo Han
31
3
0
02 Nov 2023
Large-Scale and Multi-Perspective Opinion Summarization with Diverse Review Subsets
Han Jiang
Rui Wang
Zhihua Wei
Yu Li
Xinpeng Wang
35
4
0
20 Oct 2023
Multilingual estimation of political-party positioning: From label aggregation to long-input Transformers
Dmitry Nikolaev
Tanise Ceron
Sebastian Padó
19
1
0
19 Oct 2023
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review
Guanghua Wang
Weili Wu
AI4TS
AILaw
33
3
0
13 Oct 2023
Larth: Dataset and Machine Translation for Etruscan
Gianluca Vico
Gerasimos Spanakis
6
1
0
09 Oct 2023
MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models
Yifan Wei
Yisong Su
Huanhuan Ma
Xiaoyan Yu
Fangyu Lei
Yuanzhe Zhang
Jun Zhao
Kang Liu
LRM
17
10
0
08 Oct 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
29
15
0
28 Sep 2023
Dynamic Multi-Scale Context Aggregation for Conversational Aspect-Based Sentiment Quadruple Analysis
Yuqing Li
Wenyuan Zhang
Binbin Li
Siyu Jia
Zisen Qi
Xingbang Tan
34
3
0
27 Sep 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
37
151
0
21 Sep 2023
Language Modeling Is Compression
Grégoire Delétang
Anian Ruoss
Paul-Ambroise Duquenne
Elliot Catt
Tim Genewein
...
Wenliang Kevin Li
Matthew Aitchison
Laurent Orseau
Marcus Hutter
J. Veness
AI4CE
30
129
0
19 Sep 2023
Improving Domain Generalization for Sound Classification with Sparse Frequency-Regularized Transformer
Honglin Mu
Wentian Xia
Wanxiang Che
10
1
0
19 Jul 2023
Can Model Fusing Help Transformers in Long Document Classification? An Empirical Study
Damith Premasiri
Tharindu Ranasinghe
R. Mitkov
VLM
21
1
0
18 Jul 2023
Gloss Attention for Gloss-free Sign Language Translation
Aoxiong Yin
Tianyun Zhong
Lilian H. Y. Tang
Weike Jin
Tao Jin
Zhou Zhao
SLR
16
37
0
14 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
16
6
0
12 Jul 2023
A Side-by-side Comparison of Transformers for English Implicit Discourse Relation Classification
Bruce W. Lee
Bongseok Yang
J. Lee
16
0
0
07 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
35
151
0
05 Jul 2023
Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting
Chen Chen
Yufei Wang
Aixin Sun
Bing Li
Kwok-Yan Lam
24
39
0
04 Jul 2023
Analyzing Multiple-Choice Reading and Listening Comprehension Tests
Vatsal Raina
Adian Liusie
Mark J. F. Gales
ELM
33
2
0
03 Jul 2023
Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking
Björn Bebensee
Haejun Lee
23
4
0
15 Jun 2023
Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin
Xuli Shen
Bin Li
Xiangyang Xue
24
36
0
14 Jun 2023
S$^{3}$: Increasing GPU Utilization during Generative Inference for Higher Throughput
Yunho Jin
Chun-Feng Wu
David Brooks
Gu-Yeon Wei
29
62
0
09 Jun 2023
Plug-and-Play Document Modules for Pre-trained Models
Chaojun Xiao
Zhengyan Zhang
Xu Han
Chi-Min Chan
Yankai Lin
Zhiyuan Liu
Xiangyang Li
Zhonghua Li
Zhao Cao
Maosong Sun
KELM
22
5
0
28 May 2023
Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization
Dongqi Pu
Yifa Wang
Vera Demberg
29
21
0
26 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurélien Lucchi
Thomas Hofmann
34
53
0
25 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Ziwei He
Meng-Da Yang
Minwei Feng
Jingcheng Yin
X. Wang
Jingwen Leng
Zhouhan Lin
ViT
29
11
0
24 May 2023
Neural Machine Translation for Code Generation
K. Dharma
Clayton T. Morrison
32
4
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
76
556
0
22 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
21
12
0
22 May 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
David C. Uthus
Santiago Ontañón
Joshua Ainslie
Mandy Guo
VLM
28
10
0
18 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
10
0
0
17 May 2023
Ray-Patch: An Efficient Querying for Light Field Transformers
T. B. Martins
Javier Civera
ViT
36
0
0
16 May 2023
DLUE: Benchmarking Document Language Understanding
Ruoxi Xu
Hongyu Lin
Xinyan Guan
Xianpei Han
Yingfei Sun
Le Sun
ELM
18
0
0
16 May 2023
A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization
Chenhui Shen
Liying Cheng
Xuan-Phi Nguyen
Yang You
Lidong Bing
20
10
0
15 May 2023
Legal Extractive Summarization of U.S. Court Opinions
Emmanuel J. Bauer
Dominik Stammbach
Nianlong Gu
Elliott Ash
AILaw
ELM
18
7
0
15 May 2023
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng
Cole Hawkins
Min-Fong Hong
Aston Zhang
Nikolaos Pappas
Vikas Singh
Shuai Zheng
19
6
0
07 May 2023
The Role of Global and Local Context in Named Entity Recognition
Arthur Amalvy
Vincent Labatut
Richard Dufour
38
4
0
04 May 2023