Self-Attention with Relative Position Representations
Peter Shaw, Jakob Uszkoreit, Ashish Vaswani
arXiv:1803.02155 · 6 March 2018
Papers citing "Self-Attention with Relative Position Representations" (50 of 411 shown)
Self-Attention with Cross-Lingual Position Representation
  Liang Ding, Longyue Wang, Dacheng Tao · MILM · 37 citations · 28 Apr 2020
Lite Transformer with Long-Short Range Attention
  Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han · 317 citations · 24 Apr 2020
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
  Gaëtan Hadjeres, Léopold Crestel · 18 citations · 21 Apr 2020
DIET: Lightweight Language Understanding for Dialogue Systems
  Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol · 160 citations · 21 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
  Yekun Chai, Jin Shuo, Xinwen Hou · 16 citations · 17 Apr 2020
Improving Scholarly Knowledge Representation: Evaluating BERT-based Models for Scientific Relation Classification
  Ming Jiang, Jennifer D'Souza, Sören Auer, J. S. Downie · 11 citations · 13 Apr 2020
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
  Alireza Mohammadshahi, James Henderson · 33 citations · 29 Mar 2020
SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
  Xiaoya Li, Yuxian Meng, Mingxin Zhou, Qinghong Han, Fei Wu, Jiwei Li · 20 citations · 22 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
  Longteng Guo, Jing Liu, Xinxin Zhu, Peng Yao, Shichen Lu, Hanqing Lu · ViT · 189 citations · 19 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
  Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen · 3DPC · 658 citations · 17 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model
  Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh · 107 citations · 13 Mar 2020
Heterogeneous Graph Transformer
  Ziniu Hu, Yuxiao Dong, Kuansan Wang, Yizhou Sun · 1,170 citations · 3 Mar 2020
Natural Language Processing Advancements By Deep Learning: A Survey
  A. Torfi, Rouzbeh A. Shirvani, Yaser Keneshloo, Nader Tavvaf, Edward A. Fox · AI4CE, VLM · 216 citations · 2 Mar 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
  Alessandro Raganato, Yves Scherrer, Jörg Tiedemann · 92 citations · 24 Feb 2020
Transformer Hawkes Process
  Simiao Zuo, Haoming Jiang, Zichong Li, T. Zhao, H. Zha · AI4TS · 286 citations · 21 Feb 2020
Molecule Attention Transformer
  Lukasz Maziarka, Tomasz Danel, Slawomir Mucha, Krzysztof Rataj, Jacek Tabor, Stanislaw Jastrzebski · 168 citations · 19 Feb 2020
LAMBERT: Layout-Aware (Language) Modeling for information extraction
  Lukasz Garncarek, Rafal Powalski, Tomasz Stanislawek, Bartosz Topolski, Piotr Halama, M. Turski, Filip Graliński · 87 citations · 19 Feb 2020
A Survey of Deep Learning Techniques for Neural Machine Translation
  Shu Yang, Yuxin Wang, Xiaowen Chu · VLM, AI4TS, AI4CE · 138 citations · 18 Feb 2020
LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention
  Xiaoya Li, Yuxian Meng, Arianna Yuan, Fei Wu, Jiwei Li · 12 citations · 8 Feb 2020
Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions
  Yu-Siang Huang, Yi-Hsuan Yang · ViT · 39 citations · 1 Feb 2020
Attention! A Lightweight 2D Hand Pose Estimation Approach
  Nicholas Santavas, Ioannis Kansizoglou, Loukas Bampis, E. Karakasis, Antonios Gasteratos · 50 citations · 22 Jan 2020
SANST: A Self-Attentive Network for Next Point-of-Interest Recommendation
  Qi Guo, Jianzhong Qi · AI4TS · 8 citations · 22 Jan 2020
Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
  Goran Glavaš, Swapna Somasundaran · VLM · 55 citations · 3 Jan 2020
Encoding word order in complex embeddings
  Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, J. Simonsen · 111 citations · 27 Dec 2019
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
  Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun · 108 citations · 25 Dec 2019
Encoding Musical Style with Transformer Autoencoders
  Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel · 89 citations · 10 Dec 2019
Neural Machine Translation: A Review and Survey
  Felix Stahlberg · 3DV, AI4TS, MedIm · 312 citations · 4 Dec 2019
Graph Transformer for Graph-to-Sequence Learning
  Deng Cai, W. Lam · 221 citations · 18 Nov 2019
What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
  Timothee Mickus, Denis Paperno, Mathieu Constant, Kees van Deemter · 45 citations · 13 Nov 2019
Location Attention for Extrapolation to Longer Sequences
  Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni · 40 citations · 10 Nov 2019
Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
  Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin · 40 citations · 10 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
  Matthew Henderson, I. Casanueva, Nikola Mrkšić, Pei-hao Su, Tsung-Hsien Wen, Ivan Vulić · 196 citations · 9 Nov 2019
Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding
  Pan Zhou, Ruchao Fan, Wei Chen, Jia Jia · 26 citations · 1 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu · AIMat · 19,493 citations · 23 Oct 2019
Multilingual Neural Machine Translation for Zero-Resource Languages
  Surafel Melaku Lakew, Marcello Federico, Mattia Antonino Di Gangi, Marco Turchi · 15 citations · 16 Sep 2019
Reasoning Over Semantic-Level Graph for Fact Checking
  Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, M. Zhou, Jiahai Wang, Jian Yin · HILM, GNN · 165 citations · 9 Sep 2019
Improving Multi-Head Attention with Capsule Networks
  Shuhao Gu, Yang Feng · 12 citations · 31 Aug 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
  S. Rothe, Shashi Narayan, Aliaksei Severyn · SILM · 433 citations · 29 Jul 2019
Investigating Self-Attention Network for Chinese Word Segmentation
  Leilei Gan, Yue Zhang · 11 citations · 26 Jul 2019
Program Synthesis and Semantic Parsing with Learned Code Idioms
  Richard Shin, Miltiadis Allamanis, Marc Brockschmidt, Oleksandr Polozov · 87 citations · 26 Jun 2019
Lattice Transformer for Speech Translation
  Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan · 48 citations · 13 Jun 2019
Lattice-Based Transformer Encoder for Neural Machine Translation
  Fengshun Xiao, Jiangtong Li, Zhao Hai, Rui Wang, Kehai Chen · 42 citations · 4 Jun 2019
Language Modeling with Deep Transformers
  Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney · KELM · 172 citations · 10 May 2019
Attention Augmented Convolutional Networks
  Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le · 999 citations · 22 Apr 2019
Convolutional Self-Attention Networks
  Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu · 124 citations · 5 Apr 2019
Modeling Recurrence for Transformer
  Jie Hao, Xing Wang, Baosong Yang, Longyue Wang, Jinfeng Zhang, Zhaopeng Tu · 85 citations · 5 Apr 2019
Context-Aware Self-Attention Networks
  Baosong Yang, Jian Li, Derek F. Wong, Lidia S. Chao, Xing Wang, Zhaopeng Tu · 113 citations · 15 Feb 2019
Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers
  Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar · ViT · 97 citations · 4 Feb 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
  Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov · VLM · 3,674 citations · 9 Jan 2019
Dynamic Graph Representation Learning via Self-Attention Networks
  Aravind Sankar, Yanhong Wu, Liang Gou, Wei Zhang, Hao Yang · GNN · 119 citations · 22 Dec 2018