Linformer: Self-Attention with Linear Complexity

8 June 2020
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
ArXiv (2006.04768) · PDF · HTML
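
For context on the technique named in the title: Linformer projects the keys and values along the sequence dimension, from length n down to a fixed k, so the attention map is n × k rather than n × n and the cost drops from O(n²) to O(nk). Below is a minimal, single-head PyTorch sketch of that mechanism, not the authors' released implementation; the class name LinformerSelfAttention, the parameter names proj_k/proj_v, and the default k=256 are illustrative assumptions.

import torch
from torch import nn

class LinformerSelfAttention(nn.Module):
    # Minimal single-head sketch; names and defaults are illustrative.
    # proj_k / proj_v play the role of the paper's learned k x n projections
    # E and F. For simplicity this assumes inputs of exactly seq_len tokens;
    # a full implementation would also handle shorter sequences, multiple
    # heads, and masking.
    def __init__(self, seq_len: int, dim: int, k: int = 256):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.proj_k = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)  # E: k x n
        self.proj_v = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)  # F: k x n

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k_, v = self.to_q(x), self.to_k(x), self.to_v(x)
        # Compress keys and values along the sequence axis: n tokens -> k.
        k_ = torch.einsum('kn,bnd->bkd', self.proj_k, k_)  # (batch, k, dim)
        v = torch.einsum('kn,bnd->bkd', self.proj_v, v)    # (batch, k, dim)
        # Attention map is (batch, seq_len, k) instead of (batch, seq_len, seq_len).
        attn = torch.softmax(q @ k_.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v  # (batch, seq_len, dim), computed in O(n * k)

# Usage sketch:
# out = LinformerSelfAttention(seq_len=1024, dim=64, k=128)(torch.randn(2, 1024, 64))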

Papers citing "Linformer: Self-Attention with Linear Complexity"

50 / 1,050 papers shown
Transforming the Output of Generative Pre-trained Transformer: The Influence of the PGI Framework on Attention Dynamics
Aline Ioste
32
1
0
25 Aug 2023
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie
Pengyu Cheng
Xiao Liang
Yong Dai
Nan Du
47
7
0
25 Aug 2023
Easy attention: A simple attention mechanism for temporal predictions with transformers
Marcial Sanchis-Agudo
Yuning Wang
Roger Arnau
L. Guastoni
Jasmin Lim
Karthik Duraisamy
Ricardo Vinuesa
AI4TS
24
0
0
24 Aug 2023
Enhancing Graph Transformers with Hierarchical Distance Structural Encoding
Yuan Luo
Hongkang Li
Lei Shi
Xiao-Ming Wu
40
7
0
22 Aug 2023
A Lightweight Transformer for Faster and Robust EBSD Data Collection
Harry Dong
S. Donegan
M. Shah
Yuejie Chi
34
2
0
18 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
47
3
0
18 Aug 2023
Memory-and-Anticipation Transformer for Online Action Understanding
Jiahao Wang
Guo Chen
Yifei Huang
Liming Wang
Tong Lu
OffRL
67
37
0
15 Aug 2023
Optimizing a Transformer-based network for a deep learning seismic processing workflow
R. Harsuko
T. Alkhalifah
30
9
0
09 Aug 2023
Sparse Binary Transformers for Multivariate Time Series Modeling
Matt Gorbett
Hossein Shirazi
I. Ray
AI4TS
37
13
0
09 Aug 2023
RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling
Herman Sugiharto
Aradea
H. Mubarok
16
0
0
07 Aug 2023
ConvFormer: Revisiting Transformer for Sequential User Modeling
Hao Wang
Jianxun Lian
Mingyang Wu
Haoxuan Li
Jiajun Fan
Wanyue Xu
Chaozhuo Li
Xing Xie
24
3
0
05 Aug 2023
DeDrift: Robust Similarity Search under Content Drift
Dmitry Baranchuk
Matthijs Douze
Yash Upadhyay
I. Z. Yalniz
29
8
0
05 Aug 2023
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment
Kun Yuan
Zishang Kong
Chuanchuan Zheng
Ming-Ting Sun
Xingsen Wen
ViT
40
14
0
31 Jul 2023
RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects
Sascha Kirch
Valeria Olyunina
Jan Ondřej
Rafael Pagés
Sergio Martín
Clara Pérez-Molina
33
2
0
29 Jul 2023
Improving Social Media Popularity Prediction with Multiple Post Dependencies
Zhizhen Zhang
Xiao-Zhu Xie
Meng Yang
Ye Tian
Yong-jia Jiang
Yong Cui
29
5
0
28 Jul 2023
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka
Issei Sato
41
16
0
26 Jul 2023
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Tri Dao
LRM
44
1,172
0
17 Jul 2023
Fast Quantum Algorithm for Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
Ruizhe Zhang
LRM
39
22
0
16 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
54
63
0
16 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
55
19
0
13 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
31
6
0
12 Jul 2023
ReLoRA: High-Rank Training Through Low-Rank Updates
Vladislav Lialin
Namrata Shivagunde
Sherin Muckatira
Anna Rumshisky
BDL
37
96
0
11 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Percy Liang
RALM
45
1,452
0
06 Jul 2023
Scaling In-Context Demonstrations with Structured Attention
Tianle Cai
Kaixuan Huang
Jason D. Lee
Mengdi Wang
LRM
41
8
0
05 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
46
152
0
05 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
35
8
0
05 Jul 2023
Sumformer: Universal Approximation for Efficient Transformers
Silas Alberti
Niclas Dern
L. Thesing
Gitta Kutyniok
27
16
0
05 Jul 2023
Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network
Zizhuo Li
Jiayi Ma
39
2
0
04 Jul 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
Yujia Xiao
Shaofei Zhang
Xi Wang
Xuejiao Tan
Lei He
Sheng Zhao
Frank Soong
Tan Lee
32
5
0
03 Jul 2023
Extending Context Window of Large Language Models via Positional Interpolation
Shouyuan Chen
Sherman Wong
Liangjian Chen
Yuandong Tian
48
497
0
27 Jun 2023
LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Daya Guo
Canwen Xu
Nan Duan
Jian Yin
Julian McAuley
20
80
0
26 Jun 2023
A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention
M. A. José
Fabio Gagliardi Cozman
26
3
0
25 Jun 2023
LightGlue: Local Feature Matching at Light Speed
Philipp Lindenberger
Paul-Edouard Sarlin
Marc Pollefeys
3DV
VLM
41
401
0
23 Jun 2023
Efficient Online Processing with Deep Neural Networks
Lukas Hedegaard
36
0
0
23 Jun 2023
Constant Memory Attention Block
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Yoshua Bengio
Mohamed Osama Ahmed
35
0
0
21 Jun 2023
Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
26
3
0
21 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
63
13
0
19 Jun 2023
Block-State Transformers
Mahan Fathi
Jonathan Pilault
Orhan Firat
C. Pal
Pierre-Luc Bacon
Ross Goroshin
47
17
0
15 Jun 2023
GCformer: An Efficient Framework for Accurate and Scalable Long-Term Multivariate Time Series Forecasting
Yanjun Zhao
Ziqing Ma
Tian Zhou
Liang Sun
M. Ye
Yi Qian
AI4TS
43
22
0
14 Jun 2023
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim
Coleman Hooper
A. Gholami
Zhen Dong
Xiuyu Li
Sheng Shen
Michael W. Mahoney
Kurt Keutzer
MQ
38
168
0
13 Jun 2023
Augmenting Language Models with Long-Term Memory
Weizhi Wang
Li Dong
Hao Cheng
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
KELM
RALM
46
84
0
12 Jun 2023
Revisiting Token Pruning for Object Detection and Instance Segmentation
Yifei Liu
Mathias Gehrig
Nico Messikommer
Marco Cannici
Davide Scaramuzza
ViT
VLM
50
25
0
12 Jun 2023
$E(2)$-Equivariant Vision Transformer
Renjun Xu
Kaifan Yang
Ke Liu
Fengxiang He
ViT
MDE
23
10
0
11 Jun 2023
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
Haoran You
Huihong Shi
Yipin Guo
Yingyan Lin
37
16
0
10 Jun 2023
Lightweight Monocular Depth Estimation via Token-Sharing Transformer
Dong-Jae Lee
Jae Young Lee
Hyounguk Shon
Eojindl Yi
Yeong-Hun Park
Sung-Jin Cho
Junmo Kim
ViT
MDE
14
4
0
09 Jun 2023
Multi-level Multiple Instance Learning with Transformer for Whole Slide Image Classification
Rui-qi Zhang
Qiaozheng Zhang
Yingzhuang Liu
Hao Xin
Yang Liu
Xinggang Wang
ViT
MedIm
47
8
0
08 Jun 2023
Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares
C. Kümmerle
J. Maly
27
1
0
08 Jun 2023
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding
Junda Wu
Tong Yu
Rui Wang
Zhao Song
Ruiyi Zhang
Handong Zhao
Chaochao Lu
Shuai Li
Ricardo Henao
VLM
44
23
0
08 Jun 2023
An Efficient Transformer for Simultaneous Learning of BEV and Lane Representations in 3D Lane Detection
Ziye Chen
K. Smith‐Miles
Bo Du
G. Qian
Biwei Huang
ViT
36
8
0
08 Jun 2023
GAT-GAN: A Graph-Attention-based Time-Series Generative Adversarial Network
Srikrishna Iyer
Teck-Hou Teng
AI4TS
26
1
0
03 Jun 2023