ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

arXiv:2009.06732 — Cited By
Efficient Transformers: A Survey

14 September 2020
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
    VLM

Papers citing "Efficient Transformers: A Survey"

50 / 633 papers shown
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa
C. Riquelme
J. Puigcerver
Rodolphe Jenatton
N. Houlsby
VLM
MoE
28
185
0
06 Jun 2022
Exploring Transformers for Behavioural Biometrics: A Case Study in Gait Recognition
Paula Delgado-Santos
Ruben Tolosana
R. Guest
F. Deravi
R. Vera-Rodríguez
32
30
0
03 Jun 2022
BayesFormer: Transformer with Uncertainty Estimation
Karthik Abinav Sankararaman
Sinong Wang
Han Fang
UQCV
BDL
30
10
0
02 Jun 2022
Fair Comparison between Efficient Attentions
Jiuk Hong
Chaehyeon Lee
Soyoun Bang
Heechul Jung
25
1
0
01 Jun 2022
Transformer with Fourier Integral Attentions
T. Nguyen
Minh Pham
Tam Nguyen
Khai Nguyen
Stanley J. Osher
Nhat Ho
25
4
0
01 Jun 2022
Transformers for Multi-Object Tracking on Point Clouds
Felicia Ruppel
F. Faion
Claudius Gläser
Klaus C. J. Dietmayer
3DPC
26
17
0
31 May 2022
Prompt Injection: Parameterization of Fixed Inputs
Eunbi Choi
Yongrae Jo
Joel Jang
Minjoon Seo
18
29
0
31 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
78
2,045
0
27 May 2022
Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design
Jörg Franke
Frederic Runge
Frank Hutter
17
14
0
27 May 2022
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
239
128
0
25 May 2022
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu
Sandesh Shetty
Tobias Hinz
Matthew Fisher
Richard Y. Zhang
Taesung Park
E. Kalogerakis
ViT
27
30
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
350
0
21 May 2022
Exploring Extreme Parameter Compression for Pre-trained Language Models
Yuxin Ren
Benyou Wang
Lifeng Shang
Xin Jiang
Qun Liu
33
18
0
20 May 2022
Transformer with Memory Replay
R. Liu
Barzan Mozafari
OffRL
70
4
0
19 May 2022
Text Detection & Recognition in the Wild for Robot Localization
Z. Raisi
John S. Zelek
22
0
0
17 May 2022
Transkimmer: Transformer Learns to Layer-wise Skim
Yue Guan
Zhengyi Li
Jingwen Leng
Zhouhan Lin
Minyi Guo
80
38
0
15 May 2022
Symphony Generation with Permutation Invariant Language Model
Jiafeng Liu
Yuanliang Dong
Zehua Cheng
Xinran Zhang
Xiaobing Li
Feng Yu
Maosong Sun
21
39
0
10 May 2022
Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication
Yang Wang
Zhen Gao
Dezhi Zheng
Sheng Chen
Deniz Gündüz
H. Vincent Poor
AI4CE
19
83
0
08 May 2022
Transformers in Time-series Analysis: A Tutorial
Sabeen Ahmed
Ian E. Nielsen
Aakash Tripathi
Shamoon Siddiqui
Ghulam Rasool
R. Ramachandran
AI4TS
36
142
0
28 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Yujun Lin
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
27
107
0
25 Apr 2022
ChapterBreak: A Challenge Dataset for Long-Range Language Models
Simeng Sun
Katherine Thai
Mohit Iyyer
18
19
0
22 Apr 2022
On the Locality of Attention in Direct Speech Translation
Belen Alastruey
Javier Ferrando
Gerard I. Gállego
Marta R. Costa-jussà
10
7
0
19 Apr 2022
Exploring Dimensionality Reduction Techniques in Multilingual Transformers
Álvaro Huertas-García
Alejandro Martín
Javier Huertas-Tato
David Camacho
29
7
0
18 Apr 2022
Usage of specific attention improves change point detection
Anna Dmitrienko
Evgenia Romanenkova
Alexey Zaytsev
13
0
0
18 Apr 2022
Multi-Frame Self-Supervised Depth with Transformers
Vitor Campagnolo Guizilini
Rares Andrei Ambrus
Di Chen
Sergey Zakharov
Adrien Gaidon
ViT
MDE
17
84
0
15 Apr 2022
Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models
Phyllis Ang
Bhuwan Dhingra
Lisa Wu Wills
30
6
0
15 Apr 2022
Revisiting Transformer-based Models for Long Document Classification
Xiang Dai
Ilias Chalkidis
S. Darkner
Desmond Elliott
VLM
18
68
0
14 Apr 2022
Malceiver: Perceiver with Hierarchical and Multi-modal Features for Android Malware Detection
Niall McLaughlin
30
2
0
12 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
26
6
0
11 Apr 2022
Linear Complexity Randomized Self-attention Mechanism
Lin Zheng
Chong-Jun Wang
Lingpeng Kong
22
31
0
10 Apr 2022
BERTuit: Understanding Spanish language in Twitter through a native transformer
Javier Huertas-Tato
Alejandro Martín
David Camacho
20
9
0
07 Apr 2022
Scaling Language Model Size in Cross-Device Federated Learning
Jae Hun Ro
Theresa Breiner
Lara McConnaughey
Mingqing Chen
A. Suresh
Shankar Kumar
Rajiv Mathews
FedML
26
24
0
31 Mar 2022
MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
Alan Baade
Puyuan Peng
David Harwath
25
95
0
30 Mar 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta
Albert Gu
Jonathan Berant
57
292
0
27 Mar 2022
A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers
Flavius Frasincar
31
296
0
27 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
27
28
0
24 Mar 2022
Linearizing Transformer with Key-Value Memory
Yizhe Zhang
Deng Cai
22
5
0
23 Mar 2022
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger
Thomas Paniagua
Xi Song
Naresh P. Cuntoor
Mun Wai Lee
Tianfu Wu
ViT
15
7
0
22 Mar 2022
Mask Usage Recognition using Vision Transformer with Transfer Learning and Data Augmentation
Hensel Donato Jahja
N. Yudistira
Sutrisno
ViT
18
8
0
22 Mar 2022
Efficient Classification of Long Documents Using Transformers
Hyunji Hayley Park
Yogarshi Vyas
Kashif Shah
11
51
0
21 Mar 2022
Memorizing Transformers
Yuhuai Wu
M. Rabe
DeLesley S. Hutchins
Christian Szegedy
RALM
30
173
0
16 Mar 2022
Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki
Akiko Aizawa
LLMAG
16
6
0
15 Mar 2022
Block-Recurrent Transformers
DeLesley S. Hutchins
Imanol Schlag
Yuhuai Wu
Ethan Dyer
Behnam Neyshabur
23
94
0
11 Mar 2022
WaveMix: Resource-efficient Token Mixing for Images
Pranav Jeevan
A. Sethi
17
10
0
07 Mar 2022
Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen
Yuying Quan
Zheng Qu
L. Liu
Yufei Ding
Yuan Xie
36
22
0
28 Feb 2022
PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems
Zhengyun Zhao
Qiao Jin
Fangyuan Chen
Tuorui Peng
Sheng Yu
PINN
19
34
0
28 Feb 2022
A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting
Benhan Li
Shengdong Du
Tianrui Li
AI4TS
20
2
0
23 Feb 2022
Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation
Jinjiang Guo
Qi Liu
Han Guo
Xi Lu
AI4CE
24
3
0
21 Feb 2022
Deep Learning for Hate Speech Detection: A Comparative Study
Jitendra Malik
Hezhe Qiao
Guansong Pang
Anton Van Den Hengel
51
44
0
19 Feb 2022
The NLP Task Effectiveness of Long-Range Transformers
Guanghui Qin
Yukun Feng
Benjamin Van Durme
12
28
0
16 Feb 2022