1803.02155
Cited By
Self-Attention with Relative Position Representations
6 March 2018
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
Papers citing
"Self-Attention with Relative Position Representations"
50 / 411 papers shown
Title
The Impact of Positional Encodings on Multilingual Compression
Vinit Ravishankar
Anders Søgaard
25
5
0
11 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
29
7
0
06 Sep 2021
PermuteFormer: Efficient Relative Position Encoding for Long Sequences
Peng-Jen Chen
36
21
0
06 Sep 2021
ShopTalk: A System for Conversational Faceted Search
G. Manku
James Lee-Thorp
Bhargav Kanagal
Joshua Ainslie
Jingchen Feng
...
Jim Rosswog
Sumit Sanghai
Michael Pohl
Larry Adams
D. Sivakumar
15
1
0
02 Sep 2021
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
ViT
30
128
0
26 Aug 2021
An Effective Non-Autoregressive Model for Spoken Language Understanding
Lizhi Cheng
Weijia Jia
Wenmian Yang
OffRL
29
15
0
16 Aug 2021
MUSIQ: Multi-scale Image Quality Transformer
Junjie Ke
Qifei Wang
Yilin Wang
P. Milanfar
Feng Yang
177
628
0
12 Aug 2021
Learning Fair Face Representation With Progressive Cross Transformer
Yong Li
Yufei Sun
Zhen Cui
Shiguang Shan
Jian Yang
27
11
0
11 Aug 2021
Making Transformers Solve Compositional Tasks
Santiago Ontañón
Joshua Ainslie
Vaclav Cvicek
Zachary Kenneth Fisher
44
70
0
09 Aug 2021
Multi-Branch with Attention Network for Hand-Based Person Recognition
N. L. Baisa
Bryan M. Williams
Hossein Rahmani
Plamen Angelov
Sue Black
27
4
0
04 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
34
258
0
31 Jul 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
53
330
0
29 Jul 2021
Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation
Yufei Wang
Can Xu
Huang Hu
Chongyang Tao
Stephen Wan
Mark Dras
Mark Johnson
Daxin Jiang
19
10
0
27 Jul 2021
Multi-Scale Local-Temporal Similarity Fusion for Continuous Sign Language Recognition
Pan Xie
Zhi Cui
Yao Du
Mengyi Zhao
Jianwei Cui
Bin Wang
Xiaohui Hu
SLR
23
32
0
27 Jul 2021
Residual Tree Aggregation of Layers for Neural Machine Translation
Guoliang Li
Yiyang Li
45
0
0
19 Jul 2021
DeepMutants: Training neural bug detectors with contextual mutations
Cedric Richter
Heike Wehrheim
19
3
0
14 Jul 2021
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
18
9
0
13 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
34
5
0
06 Jul 2021
Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN
Rahma Chaabouni
Roberto Dessì
Eugene Kharitonov
32
20
0
03 Jul 2021
Polarized Self-Attention: Towards High-quality Pixel-wise Regression
Huajun Liu
Fuqiang Liu
Xinyi Fan
Dong Huang
79
211
0
02 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
36
259
0
01 Jul 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
58
749
0
25 Jun 2021
Probabilistic Attention for Interactive Segmentation
Prasad Gabbur
Manjot Bilkhu
J. Movellan
34
13
0
23 Jun 2021
JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs
Pei Ke
Haozhe Ji
Yuanyuan Ran
Xin Cui
Liwei Wang
Linfeng Song
Xiaoyan Zhu
Minlie Huang
59
95
0
19 Jun 2021
Improving Compositional Generalization in Classification Tasks via Structure Annotations
Juyong Kim
Pradeep Ravikumar
Joshua Ainslie
Santiago Ontañón
CoGe
21
18
0
19 Jun 2021
Multi-head or Single-head? An Empirical Comparison for Transformer Training
Liyuan Liu
Jialu Liu
Jiawei Han
23
32
0
17 Jun 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
27
272
0
17 Jun 2021
Structure-Regularized Attention for Deformable Object Representation
Shenao Zhang
Li Shen
Zhifeng Li
Wei Liu
12
1
0
12 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
86
843
0
11 Jun 2021
GraphiT: Encoding Graph Structure in Transformers
Grégoire Mialon
Dexiong Chen
Margot Selosse
Julien Mairal
34
164
0
10 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
33
433
0
09 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
49
1,167
0
09 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
53
1,088
0
08 Jun 2021
Learning to Efficiently Sample from Diffusion Probabilistic Models
Daniel Watson
Jonathan Ho
Mohammad Norouzi
William Chan
DiffM
45
134
0
07 Jun 2021
Efficient Training of Visual Transformers with Small Datasets
Yahui Liu
E. Sangineto
Wei Bi
N. Sebe
Bruno Lepri
Marco De Nadai
ViT
36
167
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
65
329
0
07 Jun 2021
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Ping Luo
Xiaogang Wang
Jifeng Dai
Hongsheng Li
31
13
0
04 Jun 2021
An Improved Model for Voicing Silent Speech
David Gaddy
Dan Klein
26
30
0
03 Jun 2021
LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations
Ruisheng Cao
Lu Chen
Zhi Chen
Yanbin Zhao
Su Zhu
Kai Yu
22
159
0
02 Jun 2021
Link Prediction on N-ary Relational Facts: A Graph-based Approach
Quan Wang
Haifeng Wang
Yajuan Lyu
Yong Zhu
24
46
0
18 May 2021
Relative Positional Encoding for Transformers with Linear Complexity
Antoine Liutkus
Ondřej Cífka
Shih-Lun Wu
Umut Simsekli
Yi-Hsuan Yang
Gaël Richard
33
44
0
18 May 2021
HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
Ruibo Li
Guosheng Lin
Tong He
Fayao Liu
Chunhua Shen
3DPC
41
56
0
17 May 2021
MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE
Shih-Lun Wu
Yi-Hsuan Yang
ViT
25
53
0
10 May 2021
AGMB-Transformer: Anatomy-Guided Multi-Branch Transformer Network for Automated Evaluation of Root Canal Therapy
Yunxiang Li
G. Zeng
Yifan Zhang
Jun Wang
Qianni Zhang
...
Neng Xia
Ruizi Peng
Kai Tang
Yaqi Wang
Shuai Wang
MedIm
AI4CE
92
28
0
02 May 2021
Incorporating Transformer and LSTM to Kalman Filter with EM algorithm for state estimation
Zhuang Shi
35
11
0
01 May 2021
GasHis-Transformer: A Multi-scale Visual Transformer Approach for Gastric Histopathological Image Detection
Hao Chen
Chen Li
Ge Wang
Xirong Li
M. Rahaman
...
Yixin Li
Wanli Liu
Changhao Sun
Shiliang Ai
M. Grzegorzek
ViT
MedIm
34
182
0
29 Apr 2021
ConTNet: Why not use convolution and transformer at the same time?
Haotian Yan
Zhe Li
Weijian Li
Changhu Wang
Ming Wu
Chuang Zhang
ViT
20
76
0
27 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
38
2,190
0
20 Apr 2021
Question Decomposition with Dependency Graphs
Matan Hasson
Jonathan Berant
GNN
39
9
0
17 Apr 2021
Mask Attention Networks: Rethinking and Strengthen Transformer
Zhihao Fan
Yeyun Gong
Dayiheng Liu
Zhongyu Wei
Siyuan Wang
Jian Jiao
Nan Duan
Ruofei Zhang
Xuanjing Huang
34
72
0
25 Mar 2021