Compressive Transformers for Long-Range Sequence Modelling
arXiv:1911.05507 · 13 November 2019
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy Lillicrap
Tags: RALM, VLM, KELM
Papers citing "Compressive Transformers for Long-Range Sequence Modelling" (showing 50 of 232)
General-purpose, long-context autoregressive modeling with Perceiver AR · Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel · 15 Feb 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition · Chao-Yuan Wu, Yanghao Li, K. Mangalam, Haoqi Fan, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer · 20 Jan 2022 · Tags: ViT
Datasheet for the Pile · Stella Biderman, Kieran Bicheno, Leo Gao · 13 Jan 2022
Simple Local Attentions Remain Competitive for Long-Context Tasks · Wenhan Xiong, Barlas Ouguz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad · 14 Dec 2021
Couplformer: Rethinking Vision Transformer with Coupling Attention Map · Hai Lan, Xihao Wang, Xian Wei · 10 Dec 2021 · Tags: ViT
Sparse Fusion for Multimodal Transformers · Yi Ding, Alex Rich, Mason Wang, Noah Stier, M. Turk, P. Sen, Tobias Höllerer · 23 Nov 2021 · Tags: ViT
GNN-LM: Language Modeling based on Global Contexts via GNN · Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Leilei Gan, Jiwei Li · 17 Oct 2021 · Tags: LRM
Speech Summarization using Restricted Self-Attention · Roshan S. Sharma, Shruti Palaskar, A. Black, Florian Metze · 12 Oct 2021
Token Pooling in Vision Transformers · D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel · 08 Oct 2021 · Tags: ViT
ABC: Attention with Bounded-memory Control · Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith · 06 Oct 2021
Survey: Transformer based Video-Language Pre-training · Ludan Ruan, Qin Jin · 21 Sep 2021 · Tags: VLM, ViT
Do Long-Range Language Models Actually Use Long-Range Context? · Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer · 19 Sep 2021 · Tags: RALM
Primer: Searching for Efficient Transformers for Language Modeling · David R. So, Wojciech Mañke, Hanxiao Liu, Zihang Dai, Noam M. Shazeer, Quoc V. Le · 17 Sep 2021 · Tags: VLM
Space Time Recurrent Memory Network · Hung-Cuong Nguyen, Chanho Kim, Fuxin Li · 14 Sep 2021
∞-former: Infinite Memory Transformer · Pedro Henrique Martins, Zita Marinho, André F. T. Martins · 01 Sep 2021
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation · Jian Guan, Zhuoer Feng, Yamei Chen, Ru He, Xiaoxi Mao, Changjie Fan, Minlie Huang · 30 Aug 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation · Ofir Press, Noah A. Smith, M. Lewis · 27 Aug 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation · Samuel Cahyawijaya · 24 Aug 2021
Making Transformers Solve Compositional Tasks · Santiago Ontañón, Joshua Ainslie, Vaclav Cvicek, Zachary Kenneth Fisher · 09 Aug 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks · Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, T. Krishna · 13 Jul 2021
Combiner: Full Attention Transformer with Sparse Computation Cost · Hongyu Ren, H. Dai, Zihang Dai, Mengjiao Yang, J. Leskovec, Dale Schuurmans, Bo Dai · 12 Jul 2021
Long Short-Term Transformer for Online Action Detection · Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto · 07 Jul 2021 · Tags: ViT
Long-Short Transformer: Efficient Transformers for Language and Vision · Chen Zhu, Ming-Yu Liu, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro · 05 Jul 2021 · Tags: ViT, VLM
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows · Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, B. Guo · 01 Jul 2021 · Tags: ViT
Focal Self-attention for Local-Global Interactions in Vision Transformers · Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, Jianfeng Gao · 01 Jul 2021 · Tags: ViT
Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network · Zhibin Duan, Dongsheng Wang, Bo Chen, Chaojie Wang, Wenchao Chen, Yewen Li, Jie Ren, Mingyuan Zhou · 30 Jun 2021 · Tags: BDL
What Context Features Can Transformer Language Models Use? · J. O'Connor, Jacob Andreas · 15 Jun 2021 · Tags: KELM
An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates · Zhuohao Chen, Nikolaos Flemotomos, Karan Singla, Torrey A. Creed, David C. Atkins, Shrikanth Narayanan · 15 Jun 2021
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers · Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber · 11 Jun 2021
A Survey of Transformers · Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu · 08 Jun 2021 · Tags: ViT
Chasing Sparsity in Vision Transformers: An End-to-End Exploration · Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang · 08 Jun 2021 · Tags: ViT
Luna: Linear Unified Nested Attention · Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer · 03 Jun 2021
Learning to Rehearse in Long Sequence Memorization · Zhu Zhang, Chang Zhou, Jianxin Ma, Zhijie Lin, Jingren Zhou, Hongxia Yang, Zhou Zhao · 02 Jun 2021 · Tags: RALM
Less is More: Pay Less Attention in Vision Transformers · Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai · 29 May 2021 · Tags: ViT
An Attention Free Transformer · Shuangfei Zhai, Walter A. Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, J. Susskind · 28 May 2021 · Tags: ViT
Towards mental time travel: a hierarchical memory for reinforcement learning agents · Andrew Kyle Lampinen, Stephanie C. Y. Chan, Andrea Banino, Felix Hill · 28 May 2021
Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction · Minghan Shen, Pratyay Banerjee, Chitta Baral · 26 May 2021 · Tags: SSL
ReadTwice: Reading Very Large Documents with Memories · Yury Zemlyanskiy, Joshua Ainslie, Michiel de Jong, Philip Pham, Ilya Eckstein, Fei Sha · 10 May 2021 · Tags: AIMat, RALM
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation · Emilio Parisotto, Ruslan Salakhutdinov · 04 Apr 2021
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding · Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao · 29 Mar 2021 · Tags: ViT
A Practical Survey on Faster and Lighter Transformers · Quentin Fournier, G. Caron, Daniel Aloise · 26 Mar 2021
Finetuning Pretrained Transformers into RNNs · Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith · 24 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling · Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai · 19 Mar 2021 · Tags: ViT
Hurdles to Progress in Long-form Question Answering · Kalpesh Krishna, Aurko Roy, Mohit Iyyer · 10 Mar 2021
Perceiver: General Perception with Iterative Attention · Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, João Carreira · 04 Mar 2021 · Tags: VLM, ViT, MDE
Random Feature Attention · Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong · 03 Mar 2021
Coordination Among Neural Modules Through a Shared Global Workspace · Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael C. Mozer, Yoshua Bengio · 01 Mar 2021
Linear Transformers Are Secretly Fast Weight Programmers · Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber · 22 Feb 2021
Dynamic Memory based Attention Network for Sequential Recommendation · Qiaoyu Tan, Jianwei Zhang, Ninghao Liu, Xiao Shi Huang, Hongxia Yang, Jingren Zhou, Helen Zhou · 18 Feb 2021 · Tags: HAI
Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition · Priyabrata Karmakar, S. Teng, Guojun Lu · 14 Feb 2021