ResearchTrend.AI


Linformer: Self-Attention with Linear Complexity (arXiv:2006.04768)

8 June 2020
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma

Papers citing "Linformer: Self-Attention with Linear Complexity"

50 / 1,050 papers shown
Mitigating Bias in Visual Transformers via Targeted Alignment
Sruthi Sudhakar
Viraj Prabhu
Arvindkumar Krishnakumar
Judy Hoffman
08 Feb 2023
Efficient Joint Learning for Clinical Named Entity Recognition and Relation Extraction Using Fourier Networks: A Use Case in Adverse Drug Events
Anthony Yazdani
D. Proios
H. Rouhizadeh
Douglas Teodoro
08 Feb 2023
Single Cells Are Spatial Tokens: Transformers for Spatial Transcriptomic Data Imputation
Haifang Wen
Wenzhuo Tang
Wei Jin
Jiayuan Ding
Renming Liu
Xinnan Dai
Feng Shi
Lulu Shang
Jiliang Tang
Yuying Xie
06 Feb 2023
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
02 Feb 2023
Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture
Zeping Min
01 Feb 2023
Exploring Attention Map Reuse for Efficient Transformer Neural Networks
Kyuhong Shim
Jungwook Choi
Wonyong Sung
ViT
29 Jan 2023
On the Connection Between MPNN and Graph Transformer
Chen Cai
Truong-Son Hy
Rose Yu
Yusu Wang
27 Jan 2023
Deep Quantum Error Correction
Yoni Choukroun
Lior Wolf
27 Jan 2023
Effective End-to-End Vision Language Pretraining with Semantic Visual Loss
Xiaofeng Yang
Fayao Liu
Guosheng Lin
VLM
18 Jan 2023
Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
Khai Nguyen
Dang Nguyen
N. Ho
12 Jan 2023
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li
Phillip Keung
Daniel Cheng
Jungo Kasai
Noah A. Smith
11 Jan 2023
Dynamic Grained Encoder for Vision Transformers
Lin Song
Songyang Zhang
Songtao Liu
Zeming Li
Xuming He
Hongbin Sun
Jian Sun
Nanning Zheng
ViT
10 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
06 Jan 2023
Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identification
Ziyi Tang
Ruimao Zhang
Zhanglin Peng
Jinrui Chen
Liang Lin
02 Jan 2023
Robust representations of oil wells' intervals via sparse attention mechanism
Alina Rogulina
N. Baramiia
Valerii Kornilov
Sergey Petrakov
Alexey Zaytsev
AI4TS
OOD
29 Dec 2022
Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping
Tom Goldstein
MoE
28 Dec 2022
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
28 Dec 2022
A Length-Extrapolatable Transformer
Yutao Sun
Li Dong
Barun Patra
Shuming Ma
Shaohan Huang
Alon Benhaim
Vishrav Chaudhary
Xia Song
Furu Wei
20 Dec 2022
Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo
Xiaodong Liu
Jian Jiao
Denis Xavier Charles
Eren Manavoglu
Tuo Zhao
Jianfeng Gao
15 Dec 2022
Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation
Loic Themyr
Clément Rambour
Nicolas Thome
Toby Collins
Alexandre Hostettler
ViT
15 Dec 2022
Temporal Saliency Detection Towards Explainable Transformer-based Timeseries Forecasting
Nghia Duong-Trung
Kiran Madhusudhanan
Danh Le-Phuoc
AI4TS
15 Dec 2022
UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming-Hsuan Yang
Fahad Shahbaz Khan
MedIm
08 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Yu Pan
Jingjing Yin
Heng Lu
05 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
01 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
29 Nov 2022
Survey on Self-Supervised Multimodal Representation Learning and Foundation Models
Sushil Thapa
AI4TS
SSL
29 Nov 2022
Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye
Hang Zhou
Jiale Cai
Chenxing Gao
Youjia Zhang
Junle Wang
Qiang Hu
Junqing Yu
Wei Yang
27 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
27 Nov 2022
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention
Wenyuan Zeng
Meng Li
Wenjie Xiong
Tong Tong
Wen-jie Lu
Jin Tan
Runsheng Wang
Ru Huang
25 Nov 2022
Adaptive Attention Link-based Regularization for Vision Transformers
Heegon Jin
Jongwon Choi
ViT
25 Nov 2022
A Self-Attention Ansatz for Ab-initio Quantum Chemistry
Ingrid von Glehn
J. Spencer
David Pfau
24 Nov 2022
DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention
Bosheng Qin
Juncheng Li
Siliang Tang
Yueting Zhuang
24 Nov 2022
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek
T. Peters
Jan Dirk Wegner
Konrad Schindler
DiffM
23 Nov 2022
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
Zineng Tang
Jaemin Cho
Jie Lei
Joey Tianyi Zhou
VLM
21 Nov 2022
Discovering Evolution Strategies via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Tom Zahavy
Valenti Dallibard
Chris Xiaoxuan Lu
Satinder Singh
Sebastian Flennerhag
21 Nov 2022
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
Haoran You
Yunyang Xiong
Xiaoliang Dai
Bichen Wu
Peizhao Zhang
Haoqi Fan
Peter Vajda
Yingyan Lin
18 Nov 2022
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Lañcucki
Edoardo Ponti
17 Nov 2022
Learning to Kindle the Starlight
Yu Yuan
Jiaqi Wu
Lindong Wang
Zhongliang Jing
H. Leung
Shuyuan Zhu
Han Pan
DiffM
16 Nov 2022
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
16 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
14 Nov 2022
CXTrack: Improving 3D Point Cloud Tracking with Contextual Information
Tianhan Xu
Yuanchen Guo
Yunyu Lai
Songiie Zhang
3DPC
ViT
12 Nov 2022
Demystify Transformers & Convolutions in Modern Image Deep Networks
Jifeng Dai
Min Shi
Weiyun Wang
Sitong Wu
Linjie Xing
...
Lewei Lu
Jie Zhou
Xiaogang Wang
Yu Qiao
Xiao-hua Hu
ViT
10 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
10 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
09 Nov 2022
Efficient Joint Detection and Multiple Object Tracking with Spatially Aware Transformer
S. S. Nijhawan
Leo Hoshikawa
Atsushi Irie
Masakazu Yoshimura
Junji Otsuka
Takeshi Ohashi
VOT
ViT
09 Nov 2022
Linear Self-Attention Approximation via Trainable Feedforward Kernel
Uladzislau Yorsh
Alexander Kovalenko
08 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
07 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
07 Nov 2022
A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions
Mirco Ramo
Guénolé Silvestre
04 Nov 2022
QNet: A Quantum-native Sequence Encoder Architecture
Wei-Yen Day
Hao-Sheng Chen
Min Sun
31 Oct 2022