Compressive Transformers for Long-Range Sequence Modelling

13 November 2019
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy Lillicrap
RALM, VLM, KELM
ArXiv (abs) · PDF · HTML

Papers citing "Compressive Transformers for Long-Range Sequence Modelling"

Showing 32 of 232 papers. Each entry gives the title, authors, topic tags (where assigned), the listing's three count columns as shown on the page, and the publication date.

Adaptive Semiparametric Language Models
Dani Yogatama, Cyprien de Masson d'Autume, Lingpeng Kong
KELM, RALM
105 · 100 · 0
04 Feb 2021

Can We Automate Scientific Reviewing?
Weizhe Yuan, Pengfei Liu, Graham Neubig
161 · 90 · 0
30 Jan 2021

Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
193 · 187 · 0
07 Jan 2021

Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press, Noah A. Smith, M. Lewis
309 · 91 · 0
31 Dec 2020

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
116 · 55 · 0
31 Dec 2020

Rethinking Document-level Neural Machine Translation
Zewei Sun, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Shujian Huang, Jiajun Chen, Lei Li
VLM
149 · 48 · 0
18 Oct 2020

Memformer: A Memory-Augmented Transformer for Sequence Modeling
Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, A. Geramifard, Zhou Yu
73 · 49 · 0
14 Oct 2020

Zero-shot Entity Linking with Efficient Long Range Sequence Modeling
Zonghai Yao, Liangliang Cao, Huapu Pan
VLM
105 · 21 · 0
12 Oct 2020

Rethinking Attention with Performers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
198 · 1,605 · 0
30 Sep 2020

Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
Juan Cruz-Benito, Sanjay Vishwakarma, Francisco Martín-Fernández, Ismael Faro (IBM Quantum)
66 · 31 · 0
16 Sep 2020

Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
VLM
236 · 1,136 · 0
14 Sep 2020

Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, S. Sun, Yu Cheng, Jingjing Liu
93 · 29 · 0
13 Sep 2020

HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré
163 · 548 · 0
17 Aug 2020

Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea, Qiaozhu Mei
158 · 30 · 0
31 Jul 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
VLM
616 · 2,109 · 0
28 Jul 2020

Do Transformers Need Deep Long-Range Memory?
Jack W. Rae, Ali Razavi
RALM
78 · 41 · 0
07 Jul 2020

Memory Transformer
Andrey Kravchenko, Yuri Kuratov, Anton Peganov, Grigory V. Sapunov
RALM
78 · 72 · 0
20 Jun 2020

Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Peter Hawkins, Jared Davis, David Belanger, Lucy J. Colwell, Adrian Weller
100 · 86 · 0
05 Jun 2020

GMAT: Global Memory Augmentation for Transformers
Ankit Gupta, Jonathan Berant
RALM
81 · 50 · 0
05 Jun 2020

Exploring Transformers for Large-Scale Speech Recognition
Liang Lu, Changliang Liu, Jinyu Li, Jiawei Liu
67 · 41 · 0
19 May 2020

Multi-scale Transformer Language Models
Sandeep Subramanian, R. Collobert, Marc'Aurelio Ranzato, Y-Lan Boureau
58 · 13 · 0
01 May 2020

Recipes for building an open-domain chatbot
Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, ..., Myle Ott, Kurt Shuster, Eric Michael Smith, Y-Lan Boureau, Jason Weston
ALM
168 · 1,020 · 0
28 Apr 2020

Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching
Liu Yang, Mingyang Zhang, Cheng Li, Michael Bendersky, Marc Najork
96 · 89 · 0
26 Apr 2020

ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie, Santiago Ontanon, Chris Alberti, Vaclav Cvicek, Zachary Kenneth Fisher, Philip Pham, Anirudh Ravula, Sumit Sanghai, Qifan Wang, Li Yang
88 · 55 · 0
17 Apr 2020

Training with Quantization Noise for Extreme Model Compression
Angela Fan, Pierre Stock, Benjamin Graham, Edouard Grave, Remi Gribonval, Hervé Jégou, Armand Joulin
MQ
113 · 246 · 0
15 Apr 2020

Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan
RALM, VLM
232 · 4,110 · 0
10 Apr 2020

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE
413 · 607 · 0
12 Mar 2020

ProGen: Language Modeling for Protein Generation
Ali Madani, Bryan McCann, Nikhil Naik, N. Keskar, N. Anand, Raphael R. Eguchi, Po-Ssu Huang, R. Socher
96 · 288 · 0
08 Mar 2020

Sparse Sinkhorn Attention
Yi Tay, Dara Bahri, Liu Yang, Donald Metzler, Da-Cheng Juan
102 · 342 · 0
26 Feb 2020

Time-aware Large Kernel Convolutions
Vasileios Lioutas, Yuhong Guo
AI4TS
97 · 29 · 0
08 Feb 2020

Improving Transformer Models by Reordering their Sublayers
Ofir Press, Noah A. Smith, Omer Levy
87 · 88 · 0
10 Nov 2019

Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana, Aditya Koli, Kiran Khatter, Sukhdev Singh
65 · 1,085 · 0
17 Aug 2017