Self-Attention with Relative Position Representations

6 March 2018 · Peter Shaw, Jakob Uszkoreit, Ashish Vaswani
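
The cited paper (Shaw, Uszkoreit, and Vaswani, 2018) extends scaled dot-product self-attention with learned embeddings of clipped relative distances between positions: a vector a^K_ij is added to key j and a vector a^V_ij to value j when attending from position i. The NumPy sketch below illustrates that mechanism for a single attention head; the function and variable names are illustrative choices of ours, and the toy parameters are random rather than trained.

```python
import numpy as np

def relative_self_attention(x, Wq, Wk, Wv, rel_k, rel_v, max_dist):
    # Single-head self-attention with relative position representations
    # (Shaw et al., 2018). rel_k and rel_v each hold one d-dimensional
    # vector per clipped relative distance in [-max_dist, max_dist].
    n, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv

    # Clipped, shifted relative distance j - i for every (query i, key j).
    idx = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None],
                  -max_dist, max_dist) + max_dist            # (n, n)
    a_k, a_v = rel_k[idx], rel_v[idx]                        # (n, n, d)

    # e_ij = q_i . (k_j + a^K_ij) / sqrt(d)
    scores = (q @ k.T + np.einsum("id,ijd->ij", q, a_k)) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # row softmax

    # z_i = sum_j weights_ij * (v_j + a^V_ij)
    return weights @ v + np.einsum("ij,ijd->id", weights, a_v)

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
n, d, max_dist = 5, 8, 2
x = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
rel_k = rng.normal(size=(2 * max_dist + 1, d))
rel_v = rng.normal(size=(2 * max_dist + 1, d))
print(relative_self_attention(x, Wq, Wk, Wv, rel_k, rel_v, max_dist).shape)
```

Clipping distances beyond max_dist is the paper's device for keeping the number of relative-position parameters independent of sequence length; many of the papers listed below vary exactly this component (continuous dynamical models, axial factorizations, cross-lingual variants, and so on).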

Papers citing "Self-Attention with Relative Position Representations"

50 / 411 papers shown
Self-Attention with Cross-Lingual Position Representation
Liang Ding, Longyue Wang, Dacheng Tao · MILM · 33 · 37 · 0 · 28 Apr 2020

Lite Transformer with Long-Short Range Attention
Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han · 23 · 317 · 0 · 24 Apr 2020

Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
Gaëtan Hadjeres, Léopold Crestel · 34 · 18 · 0 · 21 Apr 2020

DIET: Lightweight Language Understanding for Dialogue Systems
Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol · 27 · 160 · 0 · 21 Apr 2020

Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai, Jin Shuo, Xinwen Hou · 23 · 16 · 0 · 17 Apr 2020

Improving Scholarly Knowledge Representation: Evaluating BERT-based Models for Scientific Relation Classification
Ming Jiang, Jennifer D'Souza, Sören Auer, J. S. Downie · 25 · 11 · 0 · 13 Apr 2020

Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
Alireza Mohammadshahi, James Henderson · 35 · 33 · 0 · 29 Mar 2020

SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Xiaoya Li, Yuxian Meng, Mingxin Zhou, Qinghong Han, Fei Wu, Jiwei Li · 27 · 20 · 0 · 22 Mar 2020

Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo, Jing Liu, Xinxin Zhu, Peng Yao, Shichen Lu, Hanqing Lu · ViT · 135 · 189 · 0 · 19 Mar 2020

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen · 3DPC · 28 · 658 · 0 · 17 Mar 2020

Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh · 16 · 107 · 0 · 13 Mar 2020

Heterogeneous Graph Transformer
Ziniu Hu, Yuxiao Dong, Kuansan Wang, Yizhou Sun · 185 · 1,170 · 0 · 03 Mar 2020

Natural Language Processing Advancements By Deep Learning: A Survey
A. Torfi, Rouzbeh A. Shirvani, Yaser Keneshloo, Nader Tavvaf, Edward A. Fox · AI4CE, VLM · 83 · 216 · 0 · 02 Mar 2020

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann · 32 · 92 · 0 · 24 Feb 2020

Transformer Hawkes Process
Simiao Zuo, Haoming Jiang, Zichong Li, T. Zhao, H. Zha · AI4TS · 19 · 286 · 0 · 21 Feb 2020

Molecule Attention Transformer
Lukasz Maziarka, Tomasz Danel, Slawomir Mucha, Krzysztof Rataj, Jacek Tabor, Stanislaw Jastrzebski · 19 · 168 · 0 · 19 Feb 2020

LAMBERT: Layout-Aware (Language) Modeling for information extraction
Lukasz Garncarek, Rafal Powalski, Tomasz Stanislawek, Bartosz Topolski, Piotr Halama, M. Turski, Filip Graliński · 8 · 87 · 0 · 19 Feb 2020

A Survey of Deep Learning Techniques for Neural Machine Translation
Shu Yang, Yuxin Wang, Xiaowen Chu · VLM, AI4TS, AI4CE · 22 · 138 · 0 · 18 Feb 2020

LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention
Xiaoya Li, Yuxian Meng, Arianna Yuan, Fei Wu, Jiwei Li · 40 · 12 · 0 · 08 Feb 2020

Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions
Yu-Siang Huang, Yi-Hsuan Yang · ViT · 22 · 39 · 0 · 01 Feb 2020

Attention! A Lightweight 2D Hand Pose Estimation Approach
Nicholas Santavas, Ioannis Kansizoglou, Loukas Bampis, E. Karakasis, Antonios Gasteratos · 9 · 50 · 0 · 22 Jan 2020

SANST: A Self-Attentive Network for Next Point-of-Interest Recommendation
Qi Guo, Jianzhong Qi · AI4TS · 13 · 8 · 0 · 22 Jan 2020

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
Goran Glavas, Swapna Somasundaran · VLM · 23 · 55 · 0 · 03 Jan 2020

Encoding word order in complex embeddings
Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, J. Simonsen · 16 · 111 · 0 · 27 Dec 2019

Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun · 22 · 108 · 0 · 25 Dec 2019

Encoding Musical Style with Transformer Autoencoders
Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel · 33 · 89 · 0 · 10 Dec 2019

Neural Machine Translation: A Review and Survey
Felix Stahlberg · 3DV, AI4TS, MedIm · 20 · 312 · 0 · 04 Dec 2019

Graph Transformer for Graph-to-Sequence Learning
Deng Cai, W. Lam · 32 · 221 · 0 · 18 Nov 2019

What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus, Denis Paperno, Mathieu Constant, Kees van Deemter · 26 · 45 · 0 · 13 Nov 2019

Location Attention for Extrapolation to Longer Sequences
Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni · 23 · 40 · 0 · 10 Nov 2019

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin · 19 · 40 · 0 · 10 Nov 2019

ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson, I. Casanueva, Nikola Mrkšić, Pei-hao Su, Tsung-Hsien Wen, Ivan Vulić · 21 · 196 · 0 · 09 Nov 2019

Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding
Pan Zhou, Ruchao Fan, Wei Chen, Jia Jia · 11 · 26 · 0 · 01 Nov 2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu · AIMat · 121 · 19,493 · 0 · 23 Oct 2019

Multilingual Neural Machine Translation for Zero-Resource Languages
Surafel Melaku Lakew, Marcello Federico, Mattia Antonino Di Gangi, Marco Turchi · 33 · 15 · 0 · 16 Sep 2019

Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, M. Zhou, Jiahai Wang, Jian Yin · HILM, GNN · 185 · 165 · 0 · 09 Sep 2019

Improving Multi-Head Attention with Capsule Networks
Shuhao Gu, Yang Feng · 17 · 12 · 0 · 31 Aug 2019

Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
S. Rothe, Shashi Narayan, Aliaksei Severyn · SILM · 69 · 433 · 0 · 29 Jul 2019

Investigating Self-Attention Network for Chinese Word Segmentation
Leilei Gan, Yue Zhang · 21 · 11 · 0 · 26 Jul 2019

Program Synthesis and Semantic Parsing with Learned Code Idioms
Richard Shin, Miltiadis Allamanis, Marc Brockschmidt, Oleksandr Polozov · 24 · 87 · 0 · 26 Jun 2019

Lattice Transformer for Speech Translation
Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan · 34 · 48 · 0 · 13 Jun 2019

Lattice-Based Transformer Encoder for Neural Machine Translation
Fengshun Xiao, Jiangtong Li, Zhao Hai, Rui Wang, Kehai Chen · 29 · 42 · 0 · 04 Jun 2019

Language Modeling with Deep Transformers
Kazuki Irie, Albert Zeyer, Ralf Schluter, Hermann Ney · KELM · 41 · 172 · 0 · 10 May 2019

Attention Augmented Convolutional Networks
Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le · 46 · 999 · 0 · 22 Apr 2019

Convolutional Self-Attention Networks
Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu · 24 · 124 · 0 · 05 Apr 2019

Modeling Recurrence for Transformer
Jie Hao, Xing Wang, Baosong Yang, Longyue Wang, Jinfeng Zhang, Zhaopeng Tu · 45 · 85 · 0 · 05 Apr 2019

Context-Aware Self-Attention Networks
Baosong Yang, Jian Li, Derek F. Wong, Lidia S. Chao, Xing Wang, Zhaopeng Tu · 39 · 113 · 0 · 15 Feb 2019

Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers
Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar · ViT · 29 · 97 · 0 · 04 Feb 2019

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov · VLM · 38 · 3,674 · 0 · 09 Jan 2019

Dynamic Graph Representation Learning via Self-Attention Networks
Aravind Sankar, Yanhong Wu, Liang Gou, Wei Zhang, Hao Yang · GNN · 22 · 119 · 0 · 22 Dec 2018