Linformer: Self-Attention with Linear Complexity
8 June 2020
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma

Papers citing "Linformer: Self-Attention with Linear Complexity"
50 / 1,050 papers shown

An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics
Aly M. Kassem, Omar Mohamed, Ali Ashraf, Ahmed Elbehery, Salma Jamal, Anas Salah, A. Ghoneim
20 Aug 2022

Treeformer: Dense Gradient Trees for Efficient Attention Computation
Lovish Madaan, Srinadh Bhojanapalli, Himanshu Jain, Prateek Jain
18 Aug 2022

Learning with Local Gradients at the Edge
M. Lomnitz, Z. Daniels, David C. Zhang, M. Piacentino
17 Aug 2022

Recent Progress in Transformer-based Medical Image Analysis
Zhao-cheng Liu, Qiujie Lv, Ziduo Yang, Yifan Li, Chau Hung Lee, Leizhao Shen
MedIm
13 Aug 2022

Deep is a Luxury We Don't Have
Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas P. Matthews, Jason Su, Sadanand Singh
ViT, MedIm
11 Aug 2022

PatchDropout: Economizing Vision Transformers Using Patch Dropout
Yue Liu, Christos Matsoukas, Fredrik Strand, Hossein Azizpour, Kevin Smith
10 Aug 2022

Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang, Yao-Min Zhao, Peter J. Liu
RALM, LLMAG
08 Aug 2022

Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences
Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qingguo Chen, Bing Wang
VLM
08 Aug 2022

FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators
Thorsten Kurth, Shashank Subramanian, P. Harrington, Jaideep Pathak, Morteza Mardani, D. Hall, Andrea Miele, K. Kashinath, Anima Anandkumar
AI4Cl
08 Aug 2022

Global Hierarchical Attention for 3D Point Cloud Analysis
Dan Jia, Alexander Hermans, Bastian Leibe
3DPC
07 Aug 2022

Sublinear Time Algorithm for Online Weighted Bipartite Matching
Han Hu, Zhao Song, Runzhou Tao, Zhaozhuo Xu, Junze Yin, Danyang Zhuo
05 Aug 2022

Vision-Centric BEV Perception: A Survey
Yuexin Ma, Tai Wang, Xuyang Bai, Huitong Yang, Yuenan Hou, Yaming Wang, Yu Qiao, Ruigang Yang, Tianyi Zhou, Xinge Zhu
04 Aug 2022

Efficient Long-Text Understanding with Short-Text Models
Maor Ivgi, Uri Shaham, Jonathan Berant
VLM
01 Aug 2022

Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang
01 Aug 2022

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao, Wenliang Zhao, Yansong Tang, Jie Zhou, Ser-Nam Lim, Jiwen Lu
ViT
28 Jul 2022

Neural Architecture Search on Efficient Transformers and Beyond
Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong
28 Jul 2022

Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes
A. Sorokin, N. Buzun, Leonid Pugachev, Andrey Kravchenko
27 Jul 2022

Rethinking Efficacy of Softmax for Lightweight Non-Local Neural Networks
Yooshin Cho, Youngsoo Kim, Hanbyel Cho, Jaesung Ahn, H. Hong, Junmo Kim
27 Jul 2022

Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong, Seokju Cho, Jisu Nam, Stephen Lin, Seung Wook Kim
ViT
22 Jul 2022

Multi Resolution Analysis (MRA) for Approximate Self-Attention
Zhanpeng Zeng, Sourav Pal, Jeffery Kline, G. Fung, Vikas Singh
21 Jul 2022

Single Stage Virtual Try-on via Deformable Attention Flows
Shuai Bai, Huiling Zhou, Zhikang Li, Chang Zhou, Hongxia Yang
DiffM, 3DH
19 Jul 2022

Conditional DETR V2: Efficient Detection Transformer with Box Queries
Xiaokang Chen, Fangyun Wei, Gang Zeng, Jingdong Wang
ViT
18 Jul 2022

Multi-manifold Attention for Vision Transformers
D. Konstantinidis, Ilias Papastratis, K. Dimitropoulos, P. Daras
ViT
18 Jul 2022

Lightweight Vision Transformer with Cross Feature Attention
Youpeng Zhao, Huadong Tang, Yingying Jiang, A. Yong, Qiang Wu
ViT
15 Jul 2022

Recurrent Memory Transformer
Aydar Bulatov, Yuri Kuratov, Andrey Kravchenko
CLL
14 Jul 2022

Rethinking Attention Mechanism in Time Series Classification
Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao
AI4TS
14 Jul 2022

Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li, Zhengyang Geng, Zhao Kang, Wenyu Chen, Yibo Yang
13 Jul 2022

Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation
Yuhao Yang, Chao Huang, Lianghao Xia, Keli Zhang, Yanwei Yu, Chenliang Li
HAI
12 Jul 2022

Horizontal and Vertical Attention in Transformers
Litao Yu, Jing Zhang
ViT
10 Jul 2022

kMaX-DeepLab: k-means Mask Transformer
Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell D. Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
ViT
08 Jul 2022

Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab, Xuehan Xiong, A. Gritsenko, Rob Romijnders, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Schmid
08 Jul 2022

Device-Cloud Collaborative Recommendation via Meta Controller
Jiangchao Yao, Feng Wang, Xichen Ding, Shaohu Chen, Bo Han, Jingren Zhou, Hongxia Yang
07 Jul 2022

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe
06 Jul 2022

Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer
M. Morvan, N. Nikolaou, K. H. Yip, Ingo P. Waldmann
AI4TS
06 Jul 2022

Pure Transformers are Powerful Graph Learners
Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong
06 Jul 2022

Efficient Representation Learning via Adaptive Context Pooling
Chen Huang, Walter A. Talbott, Navdeep Jaitly, J. Susskind
05 Jul 2022

Softmax-free Linear Transformers
Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang
ViT
05 Jul 2022

Compute Cost Amortized Transformer for Streaming ASR
Yifan Xie, J. Macoskey, Martin H. Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel
05 Jul 2022

Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li, Yangxin Liu
01 Jul 2022

Learning Functions on Multiple Sets using Multi-Set Transformers
Kira A. Selby, Ahmad Rashid, I. Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart
ViT
30 Jun 2022

FL-Tuning: Layer Tuning for Feed-Forward Network in Transformer
Jingping Liu, Yuqiu Song, Kui Xue, Hongli Sun, Chao Wang, Lihan Chen, Haiyun Jiang, Jiaqing Liang, Tong Ruan
30 Jun 2022

Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding
Pu Wang, Hugo Van hamme
VLM
28 Jun 2022

Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Xulong Zhang, Jing Xiao
ViT
28 Jun 2022

Kernel Attention Transformer (KAT) for Histopathology Whole Slide Image Classification
Yushan Zheng, Jun Li, Jun Shi, Feng-ying Xie, Zhi-guo Jiang
ViT, MedIm
27 Jun 2022

Long Range Language Modeling via Gated State Spaces
Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur
Mamba
27 Jun 2022

Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification
Jun-Teng Yang, Sheng-Che Kao, S. Huang
26 Jun 2022

Vicinity Vision Transformer
Weixuan Sun, Zhen Qin, Huiyuan Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong
ViT
21 Jun 2022

Resource-Efficient Separation Transformer
Luca Della Libera, Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin
VLM
19 Jun 2022

All you need is feedback: Communication with block attention feedback codes
Emre Ozfatura, Yulin Shao, A. Perotti, B. Popović, Deniz Gunduz
19 Jun 2022

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao
ViT
19 Jun 2022