1706.03762
Cited By
v7 (latest)
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
30 / 27,180 papers shown
Self-Attentive Residual Decoder for Neural Machine Translation
Lesly Miculicich
Nikolaos Pappas
Dhananjay Ram
Andrei Popescu-Belis
52
20
0
14 Sep 2017
DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding
Tao Shen
Dinesh Manocha
Guodong Long
Jing Jiang
Shirui Pan
Chengqi Zhang
119
757
0
14 Sep 2017
Natural Language Inference over Interaction Space
Yichen Gong
Heng Luo
Jian Zhang
115
265
0
13 Sep 2017
Refining Source Representations with Relation Networks for Neural Machine Translation
Wen Zhang
Jiawei Hu
Yang Feng
Qun Liu
41
7
0
12 Sep 2017
Simple Recurrent Units for Highly Parallelizable Recurrence
Tao Lei
Yu Zhang
Sida I. Wang
Huijing Dai
Yoav Artzi
LRM
163
277
0
08 Sep 2017
Deep Learning Techniques for Music Generation -- A Survey
Jean-Pierre Briot
Gaëtan Hadjeres
F. Pachet
MGen
153
300
0
05 Sep 2017
Squeeze-and-Excitation Networks
Jie Hu
Li Shen
Samuel Albanie
Gang Sun
Enhua Wu
431
26,701
0
05 Sep 2017
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana
Aditya Koli
Kiran Khatter
Sukhdev Singh
65
1,079
0
17 Aug 2017
Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification
Yunlong Bian
Chuang Gan
Xiao-Chang Liu
Fu Li
Xiang Long
Yandong Li
Heng Qi
Jie Zhou
Shilei Wen
Yuanqing Lin
85
48
0
12 Aug 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
98
462
0
10 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Min Zhang
99
2,846
0
09 Aug 2017
Deep Architectures for Neural Machine Translation
Antonio Valerio Miceli Barone
Jindřich Helcl
Rico Sennrich
Barry Haddow
Alexandra Birch
88
112
0
24 Jul 2017
A Simple Neural Attentive Meta-Learner
Nikhil Mishra
Mostafa Rohaninejad
Xi Chen
Pieter Abbeel
OOD
109
200
0
11 Jul 2017
Dual Supervised Learning
Yingce Xia
Tao Qin
Wei-neng Chen
Jiang Bian
Nenghai Yu
Tie-Yan Liu
SSL
142
143
0
03 Jul 2017
VAIN: Attentional Multi-agent Predictive Modeling
Yedid Hoshen
GNN
103
241
0
19 Jun 2017
One Model To Learn Them All
Lukasz Kaiser
Aidan Gomez
Noam M. Shazeer
Ashish Vaswani
Niki Parmar
Llion Jones
Jakob Uszkoreit
VLM
ViT
87
334
0
16 Jun 2017
Depthwise Separable Convolutions for Neural Machine Translation
Lukasz Kaiser
Aidan Gomez
François Chollet
74
279
0
09 Jun 2017
Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs
Jean Maillard
S. Clark
Dani Yogatama
77
89
0
25 May 2017
Recurrent Additive Networks
Kenton Lee
Omer Levy
Luke Zettlemoyer
GNN
AI4CE
90
38
0
21 May 2017
Reinforced Mnemonic Reader for Machine Reading Comprehension
Minghao Hu
Yuxing Peng
Zhen Huang
Xipeng Qiu
Furu Wei
Ming Zhou
RALM
AIMat
97
69
0
08 May 2017
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Jacob Devlin
73
36
0
04 May 2017
Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
Zhen-Le Yang
Wei Chen
Feng Wang
Bo Xu
GAN
AI4CE
89
170
0
15 Mar 2017
Structured Attention Networks
Yoon Kim
Carl Denton
Luong Hoang
Alexander M. Rush
146
463
0
03 Feb 2017
Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey
L. Ferrone
Fabio Massimo Zanzotto
49
38
0
02 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
314
1,549
0
25 Jan 2017
Boosting Neural Machine Translation
Dakun Zhang
Jungi Kim
Josep Crego
Jean Senellart
AI4CE
73
26
0
19 Dec 2016
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model
Marcella Cornia
Lorenzo Baraldi
G. Serra
Rita Cucchiara
130
551
0
29 Nov 2016
One Sentence One Model for Neural Machine Translation
Xiaoqing Li
Jiajun Zhang
Chengqing Zong
AI4CE
165
62
0
21 Sep 2016
Quantifying the probable approximation error of probabilistic inference programs
Marco F. Cusumano-Towner
Vikash K. Mansinghka
100
7
0
31 May 2016
Impact of Power System Partitioning on the Efficiency of Distributed Multi-Step Optimization
Dongliang Chen
A. Bucchiarone
Zhihan Lv
42
12
0
31 May 2016