Linformer: Self-Attention with Linear Complexity
arXiv:2006.04768 · 8 June 2020
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
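For context on the indexed paper: Linformer's core claim is that the n×n attention map can be replaced by an n×k one (k fixed, k ≪ n) by linearly projecting the keys and values along the sequence dimension, reducing self-attention cost from O(n²) to O(n·k). Below is a minimal single-head NumPy sketch of that computation; the projection matrices E and F follow the paper's notation, but the function names and toy sizes are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    """Single-head Linformer-style attention (illustrative sketch).

    Q, K, V: (n, d) query/key/value matrices.
    E, F:    (k, n) learned projections that compress the sequence
             axis of K and V from n down to k << n.
    Cost is O(n*k*d) instead of full attention's O(n^2*d).
    """
    d = Q.shape[-1]
    K_proj = E @ K                          # (k, d): compressed keys
    V_proj = F @ V                          # (k, d): compressed values
    scores = Q @ K_proj.T / np.sqrt(d)      # (n, k) map, not (n, n)
    return softmax(scores, axis=-1) @ V_proj  # (n, d) output

# Toy usage: n=512 tokens, d=64 dims, compressed length k=32.
rng = np.random.default_rng(0)
n, d, k = 512, 64, 32
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E, F = (rng.standard_normal((k, n)) / np.sqrt(n) for _ in range(2))
print(linformer_attention(Q, K, V, E, F).shape)  # (512, 64)
```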
Papers citing "Linformer: Self-Attention with Linear Complexity" (showing 50 of 1,050):

- Studying inductive biases in image classification task · N. Arizumi · 31 Oct 2022
- XNOR-FORMER: Learning Accurate Approximations in Long Speech Transformers · Roshan S. Sharma, Bhiksha Raj · 29 Oct 2022
- Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost · Sungjun Cho, Seonwoo Min, Jinwoo Kim, Moontae Lee, Honglak Lee, Seunghoon Hong · 27 Oct 2022
- Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models · Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · [AI4CE] · 25 Oct 2022
- A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives · Carlos Hernandez-Olivan, Javier Hernandez-Olivan, J. R. Beltrán · [MGen] · 25 Oct 2022
- How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling · Samuel Cahyawijaya, Bryan Wilie, Holy Lovenia, Huang Zhong, Mingqian Zhong, Yuk-Yu Nancy Ip, Pascale Fung · [LM&MA] · 25 Oct 2022
- Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences · Aosong Feng, Irene Z Li, Yuang Jiang, Rex Ying · 21 Oct 2022
- Mitigating spectral bias for the multiscale operator learning · Xinliang Liu, Bo Xu, Shuhao Cao, Lei Zhang · [AI4CE] · 19 Oct 2022
- Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation · Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu · [MGen] · 19 Oct 2022
- The Devil in Linear Transformer · Zhen Qin, Xiaodong Han, Weixuan Sun, Dongxu Li, Lingpeng Kong, Nick Barnes, Yiran Zhong · 19 Oct 2022
- Dense but Efficient VideoQA for Intricate Compositional Reasoning · Jihyeon Janel Lee, Wooyoung Kang, Eun-Sol Kim · [CoGe] · 19 Oct 2022
- Token Merging: Your ViT But Faster · Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman · [MoMe] · 17 Oct 2022
- What Makes Convolutional Models Great on Long Sequence Modeling? · Yuhong Li, Tianle Cai, Yi Zhang, De-huai Chen, Debadeepta Dey · [VLM] · 17 Oct 2022
- Modeling Context With Linear Attention for Scalable Document-Level Translation · Zhaofeng Wu, Hao Peng, Nikolaos Pappas, Noah A. Smith · 16 Oct 2022
- Linear Video Transformer with Feature Fixation · Kaiyue Lu, Zexia Liu, Jianyuan Wang, Weixuan Sun, Zhen Qin, ..., Xuyang Shen, Huizhong Deng, Xiaodong Han, Yuchao Dai, Yiran Zhong · 15 Oct 2022
- CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling · Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong · [3DV] · 14 Oct 2022
- Exploring Long-Sequence Masked Autoencoders · Ronghang Hu, Shoubhik Debnath, Saining Xie, Xinlei Chen · 13 Oct 2022
- RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer · Jian Wang, Chen-xi Gou, Qiman Wu, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang · [ViT] · 13 Oct 2022
- LSG Attention: Extrapolation of pretrained Transformers to long sequences · Charles Condevaux, S. Harispe · 13 Oct 2022
- Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities · Brian Bartoldson, B. Kailkhura, Davis W. Blalock · 13 Oct 2022
- OpenCQA: Open-ended Question Answering with Charts · Shankar Kantharaj, Do Xuan Long, Rixie Tiffany Ko Leong, J. Tan, Enamul Hoque, Shafiq Joty · 12 Oct 2022
- Designing Robust Transformers using Robust Kernel Density Estimation · Xing Han, Tongzheng Ren, T. Nguyen, Khai Nguyen, Joydeep Ghosh, Nhat Ho · 11 Oct 2022
- Memory transformers for full context and high-resolution 3D Medical Segmentation · Loic Themyr, Clément Rambour, Nicolas Thome, Toby Collins, Alexandre Hostettler · [ViT, MedIm] · 11 Oct 2022
- Turbo Training with Token Dropout · Tengda Han, Weidi Xie, Andrew Zisserman · [ViT] · 10 Oct 2022
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights · H. H. Mao · 09 Oct 2022
- Bird-Eye Transformers for Text Generation Models · Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz · 08 Oct 2022
- Hierarchical Graph Transformer with Adaptive Node Sampling · Zaixin Zhang, Qi Liu, Qingyong Hu, Cheekong Lee · 08 Oct 2022
- Compressed Vision for Efficient Video Understanding · Olivia Wiles, João Carreira, Iain Barr, Andrew Zisserman, Mateusz Malinowski · 06 Oct 2022
- Temporally Consistent Transformers for Video Generation · Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel · [DiffM] · 05 Oct 2022
- WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability · Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang · [ViT, AI4TS] · 05 Oct 2022
- DARTFormer: Finding The Best Type Of Attention · Jason Brown, Yiren Zhao, Ilia Shumailov, Robert D. Mullins · 02 Oct 2022
- Wide Attention Is The Way Forward For Transformers? · Jason Brown, Yiren Zhao, Ilia Shumailov, Robert D. Mullins · 02 Oct 2022
- Grouped self-attention mechanism for a memory-efficient Transformer · Bumjun Jung, Yusuke Mukuta, Tatsuya Harada · [AI4TS] · 02 Oct 2022
- E-Branchformer: Branchformer with Enhanced merging for speech recognition · Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu Jeong Han, Shinji Watanabe · 30 Sep 2022
- Transformers for Object Detection in Large Point Clouds · Felicia Ruppel, F. Faion, Claudius Gläser, Klaus C. J. Dietmayer · [ViT] · 30 Sep 2022
- ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition · Martin H. Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris · 29 Sep 2022
- Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding · Fengyuan Shi, Ruopeng Gao, Weilin Huang, Limin Wang · 28 Sep 2022
- Liquid Structural State-Space Models · Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus · [AI4TS] · 26 Sep 2022
- Mega: Moving Average Equipped Gated Attention · Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer · 21 Sep 2022
- Adapting Pretrained Text-to-Text Models for Long Text Sequences · Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih · [RALM, VLM] · 21 Sep 2022
- Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design · Hongxiang Fan, Thomas C. P. Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah · 20 Sep 2022
- Graph Reasoning Transformer for Image Parsing · Dong Zhang, Jinhui Tang, Kwang-Ting Cheng · [ViT] · 20 Sep 2022
- Real-time Online Video Detection with Temporal Smoothing Transformers · Yue Zhao, Philipp Krahenbuhl · [ViT] · 19 Sep 2022
- Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence · Sunghwan Hong, Seokju Cho, Seung Wook Kim, Stephen Lin · [3DV] · 19 Sep 2022
- Hydra Attention: Efficient Attention with Many Heads · Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman · 15 Sep 2022
- Efficient Quantized Sparse Matrix Operations on Tensor Cores · Shigang Li, Kazuki Osawa, Torsten Hoefler · 14 Sep 2022
- On The Computational Complexity of Self-Attention · Feyza Duman Keles, Pruthuvi Maheshakya Wijewardena, Chinmay Hegde · 11 Sep 2022
- Pre-Training a Graph Recurrent Network for Language Representation · Yile Wang, Linyi Yang, Zhiyang Teng, M. Zhou, Yue Zhang · [GNN] · 08 Sep 2022
- A Circular Window-based Cascade Transformer for Online Action Detection · Shuyuan Cao, Weihua Luo, Bairui Wang, Wei Emma Zhang, Lin Ma · 30 Aug 2022
- Learning Heterogeneous Interaction Strengths by Trajectory Prediction with Graph Neural Network · Seungwoong Ha, Hawoong Jeong · 28 Aug 2022