1803.02155
Cited By
Self-Attention with Relative Position Representations
6 March 2018
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani
Papers citing
"Self-Attention with Relative Position Representations"
50 / 411 papers shown
Title
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
27
395
0
23 Mar 2021
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin Shahbazi
Rishab Sharma
Fatemeh H. Fard
24
25
0
19 Mar 2021
An End-to-End Network for Emotion-Cause Pair Extraction
Aaditya Singh
Shreeshail Hingane
Saim Wani
Ashutosh Modi
24
38
0
02 Mar 2021
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Benfeng Xu
Quan Wang
Yajuan Lyu
Yong Zhu
Zhendong Mao
27
166
0
20 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
281
179
0
17 Feb 2021
Revisiting Language Encoding in Learning Multilingual Representations
Shengjie Luo
Kaiyuan Gao
Shuxin Zheng
Guolin Ke
Di He
Liwei Wang
Tie-Yan Liu
34
2
0
16 Feb 2021
TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up
Yi Ding
Shiyu Chang
Zhangyang Wang
ViT
29
382
0
14 Feb 2021
Transformer Language Models with LSTM-based Cross-utterance Information Representation
G. Sun
C. Zhang
P. Woodland
76
32
0
12 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
277
525
0
04 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
290
980
0
27 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
Code Generation from Natural Language with Less Prior and More Monolingual Data
Sajad Norouzi
Keyi Tang
Yanshuai Cao
14
19
0
01 Jan 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
230
89
0
31 Dec 2020
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
71
52
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
24
68
0
30 Dec 2020
Code Summarization with Structure-induced Transformer
Hongqiu Wu
Hai Zhao
Min Zhang
41
84
0
29 Dec 2020
Portfolio Optimization with 2D Relative-Attentional Gated Transformer
Tae Wan Kim
Matloob Khushi
AI4TS
28
12
0
27 Dec 2020
Learning Light-Weight Translation Models from Deep Transformer
Bei Li
Ziyang Wang
Hui Liu
Quan Du
Tong Xiao
Chunliang Zhang
Jingbo Zhu
VLM
120
40
0
27 Dec 2020
Learning to Represent Programs with Heterogeneous Graphs
Kechi Zhang
Wenhan Wang
Huangzhao Zhang
Ge Li
Zhi Jin
GNN
21
63
0
08 Dec 2020
Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction
Anzhu Yu
Wenyue Guo
Bing Liu
Xin Chen
Xin Wang
Xuefeng Cao
Bingchuan Jiang
3DV
26
64
0
25 Nov 2020
Persuasive Dialogue Understanding: the Baselines and Negative Results
Hui Chen
Deepanway Ghosal
Navonil Majumder
Amir Hussain
Soujanya Poria
23
8
0
19 Nov 2020
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis
Xi Wang
Huaiping Ming
Lei He
Frank Soong
14
5
0
17 Nov 2020
Blind Deinterleaving of Signals in Time Series with Self-attention Based Soft Min-cost Flow Learning
Oğul Can
Y. Z. Gürbüz
B. Yildirim
A. Aydin Alatan
AI4TS
17
4
0
24 Oct 2020
A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code
Nadezhda Chirkova
Sergey Troshin
47
12
0
23 Oct 2020
SmBoP: Semi-autoregressive Bottom-up Semantic Parsing
Ohad Rubin
Jonathan Berant
138
150
0
23 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
22
169
0
22 Oct 2020
DuoRAT: Towards Simpler Text-to-SQL Models
Torsten Scholak
Raymond Li
Dzmitry Bahdanau
H. D. Vries
C. Pal
AI4TS
35
26
0
21 Oct 2020
Predicting Chemical Properties using Self-Attention Multi-task Learning based on SMILES Representation
Sangrak Lim
Yong Oh Lee
27
17
0
19 Oct 2020
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response
Jiancheng Yang
Jiajun Chen
Kaiming Kuang
Tiancheng Lin
Junjun He
Bingbing Ni
31
8
0
08 Oct 2020
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling
Di Wu
Liang Ding
Fan Lu
Jian Xie
VLM
BDL
29
80
0
06 Oct 2020
SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization
Yue Yu
Kexin Huang
Chao Zhang
Lucas Glass
Jimeng Sun
Cao Xiao
28
120
0
04 Oct 2020
Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
ViT
15
127
0
28 Sep 2020
Temporally Guided Music-to-Body-Movement Generation
Hsuan-Kai Kao
Li Su
44
42
0
17 Sep 2020
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition
Wenyong Huang
Wenchao Hu
Y. Yeung
Xiao Chen
25
50
0
13 Aug 2020
Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention
Wasi Uddin Ahmad
Xiaoyu Bai
Soomin Lee
Kai-Wei Chang
41
36
0
04 Aug 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
25
101
0
28 Jul 2020
RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
Xiaoyu Yue
Zhanghui Kuang
Chenhao Lin
Hongbin Sun
Wayne Zhang
28
160
0
15 Jul 2020
Rewiring the Transformer with Depth-Wise LSTMs
Hongfei Xu
Yang Song
Qiuhui Liu
Josef van Genabith
Deyi Xiong
42
6
0
13 Jul 2020
Transformer-XL Based Music Generation with Multiple Sequences of Time-valued Notes
Xianchao Wu
Chengyuan Wang
Qinying Lei
14
19
0
11 Jul 2020
Hybrid Models for Learning to Branch
Prateek Gupta
Maxime Gasse
Elias Boutros Khalil
M. P. Kumar
Andrea Lodi
Yoshua Bengio
GNN
22
123
0
26 Jun 2020
SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
F. Fuchs
Daniel E. Worrall
Volker Fischer
Max Welling
3DPC
45
667
0
18 Jun 2020
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
33
65
0
16 Jun 2020
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting
Giorgos Bouritsas
Fabrizio Frasca
S. Zafeiriou
M. Bronstein
58
424
0
16 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
64
2,622
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
54
475
0
22 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
33
30
0
20 May 2020
How Does Selective Mechanism Improve Self-Attention Networks?
Xinwei Geng
Longyue Wang
Xing Wang
Bing Qin
Ting Liu
Zhaopeng Tu
AAML
39
35
0
03 May 2020
Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu You
Simeng Sun
Mohit Iyyer
22
67
0
02 May 2020
A Transformer-based Approach for Source Code Summarization
Wasi Uddin Ahmad
Saikat Chakraborty
Baishakhi Ray
Kai-Wei Chang
ViT
16
375
0
01 May 2020
Capsule-Transformer for Neural Machine Translation
Sufeng Duan
Juncheng Cao
Hai Zhao
MedIm
27
4
0
30 Apr 2020