Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
arXiv: 2009.06732, 14 September 2020
Tags: VLM

Papers citing "Efficient Transformers: A Survey" (50 of 633 shown)
- Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts. Basil Mustafa, C. Riquelme, J. Puigcerver, Rodolphe Jenatton, N. Houlsby. Tags: VLM, MoE. 06 Jun 2022.
- Exploring Transformers for Behavioural Biometrics: A Case Study in Gait Recognition. Paula Delgado-Santos, Ruben Tolosana, R. Guest, F. Deravi, R. Vera-Rodríguez. 03 Jun 2022.
- BayesFormer: Transformer with Uncertainty Estimation. Karthik Abinav Sankararaman, Sinong Wang, Han Fang. Tags: UQCV, BDL. 02 Jun 2022.
- Fair Comparison between Efficient Attentions. Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung. 01 Jun 2022.
- Transformer with Fourier Integral Attentions. T. Nguyen, Minh Pham, Tam Nguyen, Khai Nguyen, Stanley J. Osher, Nhat Ho. 01 Jun 2022.
- Transformers for Multi-Object Tracking on Point Clouds. Felicia Ruppel, F. Faion, Claudius Gläser, Klaus C. J. Dietmayer. Tags: 3DPC. 31 May 2022.
- Prompt Injection: Parameterization of Fixed Inputs. Eunbi Choi, Yongrae Jo, Joel Jang, Minjoon Seo. 31 May 2022.
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré. Tags: VLM. 27 May 2022.
- Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design. Jörg Franke, Frederic Runge, Frank Hutter. 27 May 2022.
- Training Language Models with Memory Augmentation. Zexuan Zhong, Tao Lei, Danqi Chen. Tags: RALM. 25 May 2022.
- ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions. Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Y. Zhang, Taesung Park, E. Kalogerakis. Tags: ViT. 24 May 2022.
- Self-Supervised Speech Representation Learning: A Review. Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe. Tags: SSL, AI4TS. 21 May 2022.
- Exploring Extreme Parameter Compression for Pre-trained Language Models. Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu. 20 May 2022.
- Transformer with Memory Replay. R. Liu, Barzan Mozafari. Tags: OffRL. 19 May 2022.
- Text Detection & Recognition in the Wild for Robot Localization. Z. Raisi, John S. Zelek. 17 May 2022.
- Transkimmer: Transformer Learns to Layer-wise Skim. Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo. 15 May 2022.
- Symphony Generation with Permutation Invariant Language Model. Jiafeng Liu, Yuanliang Dong, Zehua Cheng, Xinran Zhang, Xiaobing Li, Feng Yu, Maosong Sun. 10 May 2022.
- Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication. Yang Wang, Zhen Gao, Dezhi Zheng, Sheng Chen, Deniz Gündüz, H. Vincent Poor. Tags: AI4CE. 08 May 2022.
- Transformers in Time-series Analysis: A Tutorial. Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ghulam Rasool, R. Ramachandran. Tags: AI4TS. 28 Apr 2022.
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications. Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han. 25 Apr 2022.
- ChapterBreak: A Challenge Dataset for Long-Range Language Models. Simeng Sun, Katherine Thai, Mohit Iyyer. 22 Apr 2022.
- On the Locality of Attention in Direct Speech Translation. Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussá. 19 Apr 2022.
- Exploring Dimensionality Reduction Techniques in Multilingual Transformers. Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho. 18 Apr 2022.
- Usage of specific attention improves change point detection. Anna Dmitrienko, Evgenia Romanenkova, Alexey Zaytsev. 18 Apr 2022.
- Multi-Frame Self-Supervised Depth with Transformers. Vitor Campagnolo Guizilini, Rares Andrei Ambrus, Di Chen, Sergey Zakharov, Adrien Gaidon. Tags: ViT, MDE. 15 Apr 2022.
- Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models. Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills. 15 Apr 2022.
- Revisiting Transformer-based Models for Long Document Classification. Xiang Dai, Ilias Chalkidis, S. Darkner, Desmond Elliott. Tags: VLM. 14 Apr 2022.
- Malceiver: Perceiver with Hierarchical and Multi-modal Features for Android Malware Detection. Niall McLaughlin. 12 Apr 2022.
- A Call for Clarity in Beam Search: How It Works and When It Stops. Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith. 11 Apr 2022.
- Linear Complexity Randomized Self-attention Mechanism. Lin Zheng, Chong-Jun Wang, Lingpeng Kong. 10 Apr 2022.
- BERTuit: Understanding Spanish language in Twitter through a native transformer. Javier Huertas-Tato, Alejandro Martín, David Camacho. 07 Apr 2022.
- Scaling Language Model Size in Cross-Device Federated Learning. Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, A. Suresh, Shankar Kumar, Rajiv Mathews. Tags: FedML. 31 Mar 2022.
- MAE-AST: Masked Autoencoding Audio Spectrogram Transformer. Alan Baade, Puyuan Peng, David Harwath. 30 Mar 2022.
- Diagonal State Spaces are as Effective as Structured State Spaces. Ankit Gupta, Albert Gu, Jonathan Berant. 27 Mar 2022.
- A General Survey on Attention Mechanisms in Deep Learning. Gianni Brauwers, Flavius Frasincar. 27 Mar 2022.
- Transformers Meet Visual Learning Understanding: A Comprehensive Review. Yuting Yang, Licheng Jiao, Xuantong Liu, F. Liu, Shuyuan Yang, Zhixi Feng, Xu Tang. Tags: ViT, MedIm. 24 Mar 2022.
- Linearizing Transformer with Key-Value Memory. Yizhe Zhang, Deng Cai. 23 Mar 2022.
- PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers. Ryan Grainger, Thomas Paniagua, Xi Song, Naresh P. Cuntoor, Mun Wai Lee, Tianfu Wu. Tags: ViT. 22 Mar 2022.
- Mask Usage Recognition using Vision Transformer with Transfer Learning and Data Augmentation. Hensel Donato Jahja, N. Yudistira, Sutrisno. Tags: ViT. 22 Mar 2022.
- Efficient Classification of Long Documents Using Transformers. Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah. 21 Mar 2022.
- Memorizing Transformers. Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy. Tags: RALM. 16 Mar 2022.
- Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs. Taichi Iki, Akiko Aizawa. Tags: LLMAG. 15 Mar 2022.
- Block-Recurrent Transformers. DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur. 11 Mar 2022.
- WaveMix: Resource-efficient Token Mixing for Images. Pranav Jeevan, A. Sethi. 07 Mar 2022.
- Dynamic N:M Fine-grained Structured Sparse Attention Mechanism. Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie. 28 Feb 2022.
- PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems. Zhengyun Zhao, Qiao Jin, Fangyuan Chen, Tuorui Peng, Sheng Yu. Tags: PINN. 28 Feb 2022.
- A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting. Benhan Li, Shengdong Du, Tianrui Li. Tags: AI4TS. 23 Feb 2022.
- Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation. Jinjiang Guo, Qi Liu, Han Guo, Xi Lu. Tags: AI4CE. 21 Feb 2022.
- Deep Learning for Hate Speech Detection: A Comparative Study. Jitendra Malik, Hezhe Qiao, Guansong Pang, Anton Van Den Hengel. 19 Feb 2022.
- The NLP Task Effectiveness of Long-Range Transformers. Guanghui Qin, Yukun Feng, Benjamin Van Durme. 16 Feb 2022.