arXiv: 2006.04768
Linformer: Self-Attention with Linear Complexity
8 June 2020
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
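Linformer's core claim, reflected in the title, is that self-attention can run in linear rather than quadratic time by projecting the length-n key and value sequences down to a fixed dimension k, so the attention map is n×k instead of n×n. A minimal single-head NumPy sketch of that idea (not the authors' implementation; `E` and `F` follow the paper's notation for the learned projection matrices, here drawn at random):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    """Single-head Linformer-style attention.

    Q, K, V: (n, d) query/key/value matrices.
    E, F:    (k, n) low-rank projections for keys and values.
    Cost is O(n * k) in the attention map instead of O(n^2).
    """
    d = Q.shape[-1]
    K_proj = E @ K                       # (k, d) projected keys
    V_proj = F @ V                       # (k, d) projected values
    scores = Q @ K_proj.T / np.sqrt(d)   # (n, k) instead of (n, n)
    P = softmax(scores, axis=-1)
    return P @ V_proj                    # (n, d)

# Example shapes: sequence length 128, model dim 64, projected dim 16.
rng = np.random.default_rng(0)
n, d, k = 128, 64, 16
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
E, F = rng.normal(size=(k, n)), rng.normal(size=(k, n))
out = linformer_attention(Q, K, V, E, F)  # shape (n, d)
```

In the paper, E and F are learned parameters (optionally shared across heads and layers); the random matrices above only illustrate the shapes and the O(n·k) cost.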
Papers citing
"Linformer: Self-Attention with Linear Complexity"
50 / 1,050 papers shown
Transforming the Output of Generative Pre-trained Transformer: The Influence of the PGI Framework on Attention Dynamics
Aline Ioste
32
1
0
25 Aug 2023
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie
Pengyu Cheng
Xiao Liang
Yong Dai
Nan Du
47
7
0
25 Aug 2023
Easy attention: A simple attention mechanism for temporal predictions with transformers
Marcial Sanchis-Agudo
Yuning Wang
Roger Arnau
L. Guastoni
Jasmin Lim
Karthik Duraisamy
Ricardo Vinuesa
AI4TS
24
0
0
24 Aug 2023
Enhancing Graph Transformers with Hierarchical Distance Structural Encoding
Yuan Luo
Hongkang Li
Lei Shi
Xiao-Ming Wu
40
7
0
22 Aug 2023
A Lightweight Transformer for Faster and Robust EBSD Data Collection
Harry Dong
S. Donegan
M. Shah
Yuejie Chi
34
2
0
18 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
47
3
0
18 Aug 2023
Memory-and-Anticipation Transformer for Online Action Understanding
Jiahao Wang
Guo Chen
Yifei Huang
Liming Wang
Tong Lu
OffRL
67
37
0
15 Aug 2023
Optimizing a Transformer-based network for a deep learning seismic processing workflow
R. Harsuko
T. Alkhalifah
30
9
0
09 Aug 2023
Sparse Binary Transformers for Multivariate Time Series Modeling
Matt Gorbett
Hossein Shirazi
I. Ray
AI4TS
37
13
0
09 Aug 2023
RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling
Herman Sugiharto
Aradea
H. Mubarok
16
0
0
07 Aug 2023
ConvFormer: Revisiting Transformer for Sequential User Modeling
Hao Wang
Jianxun Lian
Mingyang Wu
Haoxuan Li
Jiajun Fan
Wanyue Xu
Chaozhuo Li
Xing Xie
24
3
0
05 Aug 2023
DeDrift: Robust Similarity Search under Content Drift
Dmitry Baranchuk
Matthijs Douze
Yash Upadhyay
I. Z. Yalniz
29
8
0
05 Aug 2023
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment
Kun Yuan
Zishang Kong
Chuanchuan Zheng
Ming-Ting Sun
Xingsen Wen
ViT
40
14
0
31 Jul 2023
RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects
Sascha Kirch
Valeria Olyunina
Jan Ondřej
Rafael Pagés
Sergio Martín
Clara Pérez-Molina
33
2
0
29 Jul 2023
Improving Social Media Popularity Prediction with Multiple Post Dependencies
Zhizhen Zhang
Xiao-Zhu Xie
Meng Yang
Ye Tian
Yong-jia Jiang
Yong Cui
29
5
0
28 Jul 2023
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka
Issei Sato
41
16
0
26 Jul 2023
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Tri Dao
LRM
44
1,172
0
17 Jul 2023
Fast Quantum Algorithm for Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
Ruizhe Zhang
LRM
39
22
0
16 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
54
63
0
16 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
55
19
0
13 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
31
6
0
12 Jul 2023
ReLoRA: High-Rank Training Through Low-Rank Updates
Vladislav Lialin
Namrata Shivagunde
Sherin Muckatira
Anna Rumshisky
BDL
37
96
0
11 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Percy Liang
RALM
45
1,452
0
06 Jul 2023
Scaling In-Context Demonstrations with Structured Attention
Tianle Cai
Kaixuan Huang
Jason D. Lee
Mengdi Wang
LRM
41
8
0
05 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
46
152
0
05 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
35
8
0
05 Jul 2023
Sumformer: Universal Approximation for Efficient Transformers
Silas Alberti
Niclas Dern
L. Thesing
Gitta Kutyniok
27
16
0
05 Jul 2023
Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network
Zizhuo Li
Jiayi Ma
39
2
0
04 Jul 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
Yujia Xiao
Shaofei Zhang
Xi Wang
Xuejiao Tan
Lei He
Sheng Zhao
Frank Soong
Tan Lee
32
5
0
03 Jul 2023
Extending Context Window of Large Language Models via Positional Interpolation
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
48
497
0
27 Jun 2023
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Daya Guo
Canwen Xu
Nan Duan
Jian Yin
Julian McAuley
20
80
0
26 Jun 2023
A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention
M. A. José
Fabio Gagliardi Cozman
26
3
0
25 Jun 2023
LightGlue: Local Feature Matching at Light Speed
Philipp Lindenberger
Paul-Edouard Sarlin
Marc Pollefeys
3DV
VLM
41
401
0
23 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
36
0
0
23 Jun 2023
Constant Memory Attention Block
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Yoshua Bengio
Mohamed Osama Ahmed
35
0
0
21 Jun 2023
Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
26
3
0
21 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
63
13
0
19 Jun 2023
Block-State Transformers
Mahan Fathi
Jonathan Pilault
Orhan Firat
C. Pal
Pierre-Luc Bacon
Ross Goroshin
47
17
0
15 Jun 2023
GCformer: An Efficient Framework for Accurate and Scalable Long-Term Multivariate Time Series Forecasting
Yanjun Zhao
Ziqing Ma
Tian Zhou
Liang Sun
M. Ye
Yi Qian
AI4TS
43
22
0
14 Jun 2023
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim
Coleman Hooper
A. Gholami
Zhen Dong
Xiuyu Li
Sheng Shen
Michael W. Mahoney
Kurt Keutzer
MQ
38
168
0
13 Jun 2023
Augmenting Language Models with Long-Term Memory
Weizhi Wang
Li Dong
Hao Cheng
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
KELM
RALM
46
84
0
12 Jun 2023
Revisiting Token Pruning for Object Detection and Instance Segmentation
Yifei Liu
Mathias Gehrig
Nico Messikommer
Marco Cannici
Davide Scaramuzza
ViT
VLM
50
25
0
12 Jun 2023
E(2)-Equivariant Vision Transformer
Renjun Xu
Kaifan Yang
Ke Liu
Fengxiang He
ViT
MDE
23
10
0
11 Jun 2023
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
Haoran You
Huihong Shi
Yipin Guo
Yingyan Lin
37
16
0
10 Jun 2023
Lightweight Monocular Depth Estimation via Token-Sharing Transformer
Dong-Jae Lee
Jae Young Lee
Hyounguk Shon
Eojindl Yi
Yeong-Hun Park
Sung-Jin Cho
Junmo Kim
ViT
MDE
14
4
0
09 Jun 2023
Multi-level Multiple Instance Learning with Transformer for Whole Slide Image Classification
Rui-qi Zhang
Qiaozheng Zhang
Yingzhuang Liu
Hao Xin
Yang Liu
Xinggang Wang
ViT
MedIm
47
8
0
08 Jun 2023
Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares
C. Kümmerle
J. Maly
27
1
0
08 Jun 2023
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding
Junda Wu
Tong Yu
Rui Wang
Zhao Song
Ruiyi Zhang
Handong Zhao
Chaochao Lu
Shuai Li
Ricardo Henao
VLM
44
23
0
08 Jun 2023
An Efficient Transformer for Simultaneous Learning of BEV and Lane Representations in 3D Lane Detection
Ziye Chen
K. Smith‐Miles
Bo Du
G. Qian
Biwei Huang
ViT
36
8
0
08 Jun 2023
GAT-GAN: A Graph-Attention-based Time-Series Generative Adversarial Network
Srikrishna Iyer
Teck-Hou Teng
AI4TS
26
1
0
03 Jun 2023