Linformer: Self-Attention with Linear Complexity
arXiv:2006.04768 · 8 June 2020
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma

Papers citing "Linformer: Self-Attention with Linear Complexity" (50 of 1,050 shown)
- Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators · Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Z. Xie, Zhong-Yi Lu, Ji-Rong Wen · 04 Jun 2021
- Luna: Linear Unified Nested Attention · Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer · 03 Jun 2021
- Container: Context Aggregation Network [ViT] · Peng Gao, Jiasen Lu, Hongsheng Li, Roozbeh Mottaghi, Aniruddha Kembhavi · 02 Jun 2021
- Database Reasoning Over Text [ReLM, LMTD, AI4TS] · James Thorne, Majid Yazdani, Marzieh Saeidi, Fabrizio Silvestri, Sebastian Riedel, A. Halevy · 02 Jun 2021
- Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling · Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang · 02 Jun 2021
- THG: Transformer with Hyperbolic Geometry [ViT] · Zhe Liu, Yibin Xu · 01 Jun 2021
- DoT: An efficient Double Transformer for NLP tasks with tables · Syrine Krichene, Thomas Müller, Julian Martin Eisenschlos · 01 Jun 2021
- Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model · Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong Liu · 31 May 2021
- Choose a Transformer: Fourier or Galerkin · Shuhao Cao · 31 May 2021
- LEAP: Learnable Pruning for Transformer-based Models · Z. Yao, Xiaoxia Wu, Linjian Ma, Sheng Shen, Kurt Keutzer, Michael W. Mahoney, Yuxiong He · 30 May 2021
- Less is More: Pay Less Attention in Vision Transformers [ViT] · Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai · 29 May 2021
- An Attention Free Transformer [ViT] · Shuangfei Zhai, Walter A. Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, J. Susskind · 28 May 2021
- Towards mental time travel: a hierarchical memory for reinforcement learning agents · Andrew Kyle Lampinen, Stephanie C. Y. Chan, Andrea Banino, Felix Hill · 28 May 2021
- Sequence Parallelism: Long Sequence Training from System Perspective · Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You · 26 May 2021
- POCFormer: A Lightweight Transformer Architecture for Detection of COVID-19 Using Point of Care Ultrasound [MedIm] · Shehan Perera, S. Adhikari, Alper Yilmaz · 20 May 2021
- DCAP: Deep Cross Attentional Product Network for User Response Prediction · Zekai Chen, Fangtian Zhong, Zhumin Chen, Xiao Zhang, Robert Pless, Xiuzhen Cheng · 18 May 2021
- Relative Positional Encoding for Transformers with Linear Complexity · Antoine Liutkus, Ondřej Cífka, Shih-Lun Wu, Umut Simsekli, Yi-Hsuan Yang, Gaël Richard · 18 May 2021
- Neural Error Mitigation of Near-Term Quantum Simulations · Elizabeth R. Bennewitz, Florian Hopfmueller, B. Kulchytskyy, Juan Carrasquilla, Pooya Ronagh · 17 May 2021
- Doc2Dict: Information Extraction as Text Generation · Benjamin Townsend, Eamon Ito-Fisher, Lily Zhang, Madison May · 16 May 2021
- Not All Memories are Created Equal: Learning to Forget by Expiring [CLL] · Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan · 13 May 2021
- EL-Attention: Memory Efficient Lossless Attention for Generation [VLM] · Yu Yan, Jiusheng Chen, Weizhen Qi, Nikhil Bhendawade, Yeyun Gong, Nan Duan, Ruofei Zhang · 11 May 2021
- Poolingformer: Long Document Modeling with Pooling Attention · Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen · 10 May 2021
- T-EMDE: Sketching-based global similarity for cross-modal retrieval · Barbara Rychalska, Mikolaj Wieczorek, Jacek Dąbrowski · 10 May 2021
- Dispatcher: A Message-Passing Approach To Language Modelling · A. Cetoli · 09 May 2021
- FNet: Mixing Tokens with Fourier Transforms · James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon · 09 May 2021
- Long-Span Summarization via Local Attention and Content Selection · Potsawee Manakul, Mark Gales · 08 May 2021
- High-Resolution Optical Flow from 1D Attention and Correlation · Haofei Xu, Jiaolong Yang, Jianfei Cai, Juyong Zhang, Xin Tong · 28 Apr 2021
- Shot Contrastive Self-Supervised Learning for Scene Boundary Detection [SSL] · Shixing Chen, Xiaohan Nie, David D. Fan, Dongqing Zhang, Vimal Bhat, Raffay Hamid · 28 Apr 2021
- Transfer training from smaller language model · Han Zhang · 23 Apr 2021
- Multiscale Vision Transformers [ViT] · Haoqi Fan, Bo Xiong, K. Mangalam, Yanghao Li, Zhicheng Yan, Jitendra Malik, Christoph Feichtenhofer · 22 Apr 2021
- Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model · Honai Ueoka, Yugo Murawaki, Sadao Kurohashi · 20 Apr 2021
- Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence · Bhaskar Mitra, Sebastian Hofstatter, Hamed Zamani, Nick Craswell · 19 Apr 2021
- A Simple and Effective Positional Encoding for Transformers · Pu-Chin Chen, Henry Tsai, Srinadh Bhojanapalli, Hyung Won Chung, Yin-Wen Chang, Chun-Sung Ferng · 18 Apr 2021
- Semantic Frame Forecast [AI4TS] · Huang Chieh-Yang, Ting-Hao 'Kenneth' Huang · 12 Apr 2021
- Updater-Extractor Architecture for Inductive World State Representations · A. Moskvichev, James Liu · 12 Apr 2021
- Not All Attention Is All You Need · Hongqiu Wu, Hai Zhao, Min Zhang · 10 Apr 2021
- Transformers: "The End of History" for NLP? · Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov · 09 Apr 2021
- Fourier Image Transformer [ViT] · T. Buchholz, Florian Jug · 06 Apr 2021
- ViViT: A Video Vision Transformer [ViT] · Anurag Arnab, Mostafa Dehghani, G. Heigold, Chen Sun, Mario Lucic, Cordelia Schmid · 29 Mar 2021
- Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding [ViT] · Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao · 29 Mar 2021
- A Practical Survey on Faster and Lighter Transformers · Quentin Fournier, G. Caron, Daniel Aloise · 26 Mar 2021
- High-Fidelity Pluralistic Image Completion with Transformers [ViT] · Bo Liu, Jingbo Zhang, Dongdong Chen, Jing Liao · 25 Mar 2021
- Finetuning Pretrained Transformers into RNNs · Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith · 24 Mar 2021
- The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures [AI4TS] · Sushant Singh, A. Mahmood · 23 Mar 2021
- Instance-level Image Retrieval using Reranking Transformers [ViT] · Fuwen Tan, Jiangbo Yuan, Vicente Ordonez · 22 Mar 2021
- Self-Supervised Test-Time Learning for Reading Comprehension [SSL] · Pratyay Banerjee, Tejas Gokhale, Chitta Baral · 20 Mar 2021
- Scalable Vision Transformers with Hierarchical Pooling [ViT] · Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai · 19 Mar 2021
- Value-aware Approximate Attention · Ankit Gupta, Jonathan Berant · 17 Mar 2021
- Does the Magic of BERT Apply to Medical Code Assignment? A Quantitative Study · Shaoxiong Ji, M. Holtta, Pekka Marttinen · 11 Mar 2021
- Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models [VLM, TPM] · Sam Bond-Taylor, Adam Leach, Yang Long, Chris G. Willcocks · 08 Mar 2021