Linformer: Self-Attention with Linear Complexity

8 June 2020 · Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
arXiv: 2006.04768
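For readers landing here from the citation graph, a quick refresher on the cited method: Linformer projects the keys and values down to a fixed length k along the sequence dimension before computing attention, so the score matrix is n × k rather than n × n and the cost of self-attention drops from O(n^2) to O(n·k). The sketch below is a minimal single-head NumPy illustration, not the authors' released code; in the paper the projections E and F are learned, whereas here they are random so the example is self-contained, and all variable names are illustrative.

    # Minimal single-head sketch of Linformer-style attention (NumPy).
    # NOTE: in the paper E and F are learned (k x n) projections; random
    # matrices stand in for them here so the example runs on its own.
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)     # stabilize the exponent
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def linformer_attention(Q, K, V, E, F):
        # Q, K, V: (n, d); E, F: (k, n). Compressing K and V along the
        # sequence axis yields an (n, k) score matrix instead of (n, n).
        d = Q.shape[-1]
        scores = Q @ (E @ K).T / np.sqrt(d)         # (n, k) scores
        return softmax(scores) @ (F @ V)            # (n, d) output

    rng = np.random.default_rng(0)
    n, d, k = 512, 64, 32    # sequence length, head dim, projected length
    Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
    E, F = (rng.standard_normal((k, n)) / np.sqrt(n) for _ in range(2))
    print(linformer_attention(Q, K, V, E, F).shape)  # -> (512, 64)

Because k is a fixed constant (justified in the paper by the observed approximate low rank of the attention matrix), both time and memory scale linearly with the sequence length n.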

Papers citing "Linformer: Self-Attention with Linear Complexity"

Showing 50 of 1,050 citing papers.
• Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
  Xindi Wang, Mahsa Salmani, Parsa Omidi, Xiangyu Ren, Mehdi Rezagholizadeh, A. Eshaghi · 03 Feb 2024 · LRM
• Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
  Yingyi Chen, Qinghua Tao, F. Tonin, Johan A. K. Suykens · 02 Feb 2024
• Sequence Shortening for Context-Aware Machine Translation
  Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis · 02 Feb 2024
• A Manifold Representation of the Key in Vision Transformers
  Li Meng, Morten Goodwin, Anis Yazidi, P. Engelstad · 01 Feb 2024
• Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition
  Lei Liu, Li Liu, Haizhou Li · 31 Jan 2024
• SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design
  Seokju Yun, Youngmin Ro · 29 Jan 2024 · ViT
• A Comprehensive Survey of Compression Algorithms for Language Models
  Seungcheol Park, Jaehyeon Choi, Sojin Lee, U. Kang · 27 Jan 2024 · MQ
• CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
  Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Chunhua Zhou, Fengyu Sun, Di Niu · 26 Jan 2024
• Do deep neural networks utilize the weight space efficiently?
  Onur Can Koyun, B. U. Toreyin · 26 Jan 2024
• SGTR+: End-to-end Scene Graph Generation with Transformer
  Rongjie Li, Songyang Zhang, Xuming He · 23 Jan 2024 · ViT
• OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
  Chu Myaet Thwal, Minh N. H. Nguyen, Ye Lin Tun, Seongjin Kim, My T. Thai, Choong Seon Hong · 22 Jan 2024
• With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation
  Y. Wang, D. Ma, D. Cai · 21 Jan 2024 · RALM
• LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
  Zeyu Liu, Gourav Datta, Anni Li, Peter A. Beerel · 20 Jan 2024
• Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
  Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, Xinggang Wang · 17 Jan 2024 · Mamba
• The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
  Saurav Pawar, S.M. Towhidul Islam Tonmoy, S. M. M. Zaman, Vinija Jain, Aman Chadha, Amitava Das · 15 Jan 2024
• Extending LLMs' Context Window with 100 Samples
  Yikai Zhang, Junlong Li, Pengfei Liu · 13 Jan 2024
• E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
  Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, ..., Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng · 13 Jan 2024
• Transformers are Multi-State RNNs
  Matanel Oren, Michael Hassid, Nir Yarden, Yossi Adi, Roy Schwartz · 11 Jan 2024 · OffRL
• Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
  Wei Ye, Chaoya Jiang, Haiyang Xu, Chenhao Ye, Chenliang Li, Mingshi Yan, Shikun Zhang, Songhang Huang, Fei Huang · 11 Jan 2024 · VLM
• Efficient Image Deblurring Networks based on Diffusion Models
  Kang Chen, Yuanjie Liu · 11 Jan 2024 · DiffM
• Towards Real-World Aerial Vision Guidance with Categorical 6D Pose Tracker
  Jingtao Sun, Yaonan Wang, Danwei Wang · 09 Jan 2024
• SeTformer is What You Need for Vision and Language
  Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Michael Felsberg · 07 Jan 2024
• A Cost-Efficient FPGA Implementation of Tiny Transformer Model using Neural ODE
  Ikumi Okubo, Keisuke Sugiura, Hiroki Matsutani · 05 Jan 2024
• ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
  Chenhang He, Ruihuang Li, Guowen Zhang, Lei Zhang · 01 Jan 2024
• Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
  Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia · 23 Dec 2023
• Sign Language Production with Latent Motion Transformer
  Pan Xie, Taiying Peng, Yao Du, Qipeng Zhang · 20 Dec 2023 · SLR
• Cached Transformers: Improving Transformers with Differentiable Memory Cache
  Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Liang Feng, Ping Luo · 20 Dec 2023
• Efficiency-oriented approaches for self-supervised speech representation learning
  Luis Lugo, Valentin Vielzeuf · 18 Dec 2023 · SSL
• Linear Attention via Orthogonal Memory
  Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong · 18 Dec 2023
• Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
  Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu · 14 Dec 2023
• Graph Convolutions Enrich the Self-Attention in Transformers!
  Jeongwhan Choi, Hyowon Wi, Jayoung Kim, Yehjin Shin, Kookjin Lee, Nathaniel Trask, Noseong Park · 07 Dec 2023
• LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem
  Yingqiang Ge, Yujie Ren, Wenyue Hua, Shuyuan Xu, Juntao Tan, Yongfeng Zhang · 06 Dec 2023 · LLMAG
• DiffiT: Diffusion Vision Transformers for Image Generation
  Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat · 04 Dec 2023
• Bootstrapping SparseFormers from Vision Foundation Models
  Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou · 04 Dec 2023
• ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
  Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jiangming Sun · 04 Dec 2023 · AI4TS, AI4CE
• Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach
  Jinguo Cheng, Ke Li, Keli Zhang, Lijun Sun, Junchi Yan, Yuankai Wu · 04 Dec 2023 · AI4TS
• Token Fusion: Bridging the Gap between Token Pruning and Token Merging
  Minchul Kim, Shangqian Gao, Yen-Chang Hsu, Yilin Shen, Hongxia Jin · 02 Dec 2023
• Dimension Mixer: A Generalized Method for Structured Sparsity in Deep Neural Networks
  Suman Sapkota, Binod Bhattarai · 30 Nov 2023
• Diffusion Models Without Attention
  Jing Nathan Yan, Jiatao Gu, Alexander M. Rush · 30 Nov 2023
• QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks
  Chenhui Xu, Fuxun Yu, Zirui Xu, Chenchen Liu, Jinjun Xiong, Xiang Chen · 29 Nov 2023
• On the Long Range Abilities of Transformers
  Itamar Zimerman, Lior Wolf · 28 Nov 2023
• One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
  Raghav Addanki, Chenyang Li, Zhao Song, Chiwun Yang · 24 Nov 2023
• Linear Log-Normal Attention with Unbiased Concentration
  Yury Nahshan, Dor-Joseph Kampeas, E. Haleva · 22 Nov 2023
• Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
  Yunpeng Huang, Jingwei Xu, Junyu Lai, Zixu Jiang, Taolue Chen, ..., Xiaoxing Ma, Lijuan Yang, Zhou Xin, Shupeng Li, Penghao Zhao · 21 Nov 2023 · LLMAG, KELM
• Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
  Honglin Li, Yunlong Zhang, Chenglu Zhu, Jiatong Cai, Sunyi Zheng, Lin Yang · 21 Nov 2023 · VLM
• Zero redundancy distributed learning with differential privacy
  Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis · 20 Nov 2023
• LATIS: Lambda Abstraction-based Thermal Image Super-resolution
  Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray · 18 Nov 2023
• Sparse Attention-Based Neural Networks for Code Classification
  Ziyang Xiang, Zaixin Zhang, Qi Liu · 11 Nov 2023
• FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
  Daniel Y. Fu, Hermann Kumbong, Eric N. D. Nguyen, Christopher Ré · 10 Nov 2023 · VLM
• Window Attention is Bugged: How not to Interpolate Position Embeddings
  Daniel Bolya, Chaitanya K. Ryali, Judy Hoffman, Christoph Feichtenhofer · 09 Nov 2023