Efficient Attention: Attention with Linear Complexities

4 December 2018
Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Hongsheng Li
arXiv · PDF · HTML
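The paper's core idea is a reordering of the attention computation: instead of materializing the n × n map softmax(QK^T)V, it normalizes Q and K separately and computes ρq(Q)(ρk(K)^T V), so the expensive product is a small dk × dv context matrix and the total cost is linear in the sequence length n. A minimal NumPy sketch of this reordering follows; shapes and variable names are illustrative, not the authors' reference code.

import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(Q, K, V):
    # Standard attention: materializes an n x n similarity matrix -> O(n^2).
    return softmax(Q @ K.T, axis=-1) @ V

def efficient_attention(Q, K, V):
    # Reordered computation in the spirit of the paper: normalize Q and K
    # separately, then multiply K^T V first. The (dk x dv) "context" matrix
    # replaces the n x n map, so cost is O(n * dk * dv) -- linear in n.
    q = softmax(Q, axis=-1)  # each query row normalized over its dk features
    k = softmax(K, axis=0)   # each key feature normalized over the n positions
    context = k.T @ V        # (dk, dv) global context
    return q @ context       # (n, dv) output; no n x n matrix is formed

# Toy shapes; n, dk, dv follow the usual Q/K/V conventions.
n, dk, dv = 1024, 64, 64
Q, K, V = (np.random.randn(n, d) for d in (dk, dk, dv))
out = efficient_attention(Q, K, V)
assert out.shape == (n, dv)

With softmax normalization the two orderings agree only approximately; the paper argues exact equivalence under scaling normalization and treats the softmax variant as a drop-in replacement with similar behavior.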

Papers citing "Efficient Attention: Attention with Linear Complexities"

Showing 43 of 93 citing papers.
• ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery
  Andac Demir, Baris Coskunuzer, I. Segovia-Dominguez, Yuzhou Chen, Yulia R. Gel, B. Kiziltan
  07 Nov 2022

• Token Merging: Your ViT But Faster
  Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman
  17 Oct 2022 · MoMe

• CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
  Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong
  14 Oct 2022 · 3DV

• Multi-Field De-interlacing using Deformable Convolution Residual Blocks and Self-Attention
  Ronglei Ji, A. Murat Tekalp
  21 Sep 2022 · SupR

• Attention Enhanced Citrinet for Speech Recognition
  Xianchao Wu
  01 Sep 2022

• Deep Sparse Conformer for Speech Recognition
  Xianchao Wu
  01 Sep 2022

• MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
  Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
  31 Aug 2022 · DiffM, VGen

• Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
  T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang
  01 Aug 2022

• QSAN: A Near-term Achievable Quantum Self-Attention Network
  Jinjing Shi, Ren-Xin Zhao, Wenxuan Wang, Shenmin Zhang, Xuelong Li
  14 Jul 2022

• Pure Transformers are Powerful Graph Learners
  Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong
  06 Jul 2022

• Divert More Attention to Vision-Language Tracking
  Mingzhe Guo, Zhipeng Zhang, Heng Fan, Li Jing
  03 Jul 2022

• CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
  Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell D. Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
  17 Jun 2022 · ViT, MedIm

• SimA: Simple Softmax-free Attention for Vision Transformers
  Soroush Abbasi Koohpayegani, Hamed Pirsiavash
  17 Jun 2022

• Dynamic Linear Transformer for 3D Biomedical Image Segmentation
  Zheyu Zhang, Ulas Bagci
  01 Jun 2022 · ViT, MedIm

• Fair Comparison between Efficient Attentions
  Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung
  01 Jun 2022

• HCFormer: Unified Image Segmentation with Hierarchical Clustering
  Teppei Suzuki
  20 May 2022

• Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
  Arda Sahiner, Tolga Ergen, Batu Mehmet Ozturkler, John M. Pauly, Morteza Mardani, Mert Pilanci
  17 May 2022

• Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
  Derya Soydaner
  27 Apr 2022 · 3DV

• Efficient Linear Attention for Fast and Accurate Keypoint Matching
  Suwichaya Suwanwimolkul, S. Komorita
  16 Apr 2022 · 3DPC, 3DV

• A Call for Clarity in Beam Search: How It Works and When It Stops
  Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith
  11 Apr 2022

• MatchFormer: Interleaving Attention in Transformers for Feature Matching
  Qing Wang, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen
  17 Mar 2022 · ViT

• CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
  Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, Rainer Stiefelhagen
  09 Mar 2022 · ViT

• HDNet: High-resolution Dual-domain Learning for Spectral Compressive Imaging
  Xiaowan Hu, Yuanhao Cai, Jing Lin, Haoqian Wang, X. Yuan, Yulun Zhang, Radu Timofte, Luc Van Gool
  04 Mar 2022

• Self-attention Does Not Need $O(n^2)$ Memory
  M. Rabe, Charles Staats
  10 Dec 2021 · LRM

• Spectral Transform Forms Scalable Transformer
  Bingxin Zhou, Xinliang Liu, Yuehua Liu, Yunyin Huang, Pietro Lió, Yuguang Wang
  15 Nov 2021

• A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis
  Jiarong Ye, Yuan Xue, Peter Liu, R. Zaino, K. Cheng, Xiaolei Huang
  10 Nov 2021 · MedIm

• A Deep Generative Model for Reordering Adjacency Matrices
  Oh-Hyun Kwon, Chiun-How Kao, Chun-Houh Chen, K. Ma
  11 Oct 2021

• Classification of hierarchical text using geometric deep learning: the case of clinical trials corpus
  Sohrab Ferdowsi, Nikolay Borissov, J. Knafou, P. Amini, Douglas Teodoro
  04 Oct 2021

• UFO-ViT: High Performance Linear Vision Transformer without Softmax
  Jeonggeun Song
  29 Sep 2021 · ViT

• Anchor DETR: Query Design for Transformer-Based Object Detection
  Yingming Wang, Xinming Zhang, Tong Yang, Jian Sun
  15 Sep 2021 · ViT

• Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
  Samuel Cahyawijaya
  24 Aug 2021

• CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation
  Y. Tashiro, Jiaming Song, Yang Song, Stefano Ermon
  07 Jul 2021 · BDL, DiffM

• Polarized Self-Attention: Towards High-quality Pixel-wise Regression
  Huajun Liu, Fuqiang Liu, Xinyi Fan, Dong Huang
  02 Jul 2021

• XCiT: Cross-Covariance Image Transformers
  Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, …, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
  17 Jun 2021 · ViT

• CoAtNet: Marrying Convolution and Attention for All Data Sizes
  Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan
  09 Jun 2021 · ViT

• Choose a Transformer: Fourier or Galerkin
  Shuhao Cao
  31 May 2021

• Relative Positional Encoding for Transformers with Linear Complexity
  Antoine Liutkus, Ondřej Cífka, Shih-Lun Wu, Umut Simsekli, Yi-Hsuan Yang, Gaël Richard
  18 May 2021

• PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols
  Aaron Courville, Yanpeng Zhao, Kewei Tu
  28 Apr 2021

• RoFormer: Enhanced Transformer with Rotary Position Embedding
  Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu
  20 Apr 2021

• Linear Transformers Are Secretly Fast Weight Programmers
  Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber
  22 Feb 2021

• Multi-stage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images
  Rui Li, Shunyi Zheng, Chenxi Duan, Jianlin Su, Ce Zhang
  29 Nov 2020

• Sparsifying Transformer Models with Trainable Representation Pooling
  Michal Pietruszka, Łukasz Borchmann, Lukasz Garncarek
  10 Sep 2020

• Cross Attention Network for Few-shot Classification
  Rui Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
  17 Oct 2019