1706.03762
Cited By
v7 (latest)
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 26,904 papers shown
Title
Global Pose Estimation with an Attention-based Recurrent Network
Emilio Parisotto
Devendra Singh Chaplot
Jian Zhang
Ruslan Salakhutdinov
58
70
0
19 Feb 2018
Building a Word Segmenter for Sanskrit Overnight
V. Reddy
Amrith Krishna
V. Sharma
Prateek Gupta
R. VineethM.
Pawan Goyal
43
18
0
17 Feb 2018
Image Transformer
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
ViT
159
1,690
0
15 Feb 2018
Model compression via distillation and quantization
A. Polino
Razvan Pascanu
Dan Alistarh
MQ
91
734
0
15 Feb 2018
Universal Neural Machine Translation for Extremely Low Resource Languages
Jiatao Gu
Hany Hassan
Jacob Devlin
Victor O.K. Li
101
277
0
15 Feb 2018
Multimodal Generative Models for Scalable Weakly-Supervised Learning
Mike Wu
Noah D. Goodman
DRL
104
382
0
14 Feb 2018
$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
Qi Meng
Shuxin Zheng
Huishuai Zhang
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
129
39
0
11 Feb 2018
Tree-to-tree Neural Networks for Program Translation
Xinyun Chen
Chang Liu
Dawn Song
98
279
0
11 Feb 2018
On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks
Yukun Ding
Jinglan Liu
Jinjun Xiong
Yiyu Shi
MQ
117
21
0
10 Feb 2018
Online Learning for Effort Reduction in Interactive Neural Machine Translation
Álvaro Peris
F. Casacuberta
67
49
0
10 Feb 2018
Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning
Myeongjun Jang
Seungwan Seo
Pilsung Kang
DRL
86
56
0
09 Feb 2018
Zero-Resource Neural Machine Translation with Multi-Agent Communication Game
Yun Chen
Yang Liu
Victor O.K. Li
144
48
0
09 Feb 2018
Question-Answer Selection in User to User Marketplace Conversations
Girish Kumar
Matthew Henderson
Shannon Chan
Hoang-Diep Nguyen
L. Ngoo
50
8
0
06 Feb 2018
Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Tao Shen
Tianyi Zhou
Guodong Long
Jing Jiang
Sen Wang
Chengqi Zhang
AI4TS
135
144
0
31 Jan 2018
Generating Wikipedia by Summarizing Long Sequences
Peter J. Liu
Mohammad Saleh
Etienne Pot
Ben Goodrich
Ryan Sepassi
Lukasz Kaiser
Noam M. Shazeer
CVBM
221
801
0
30 Jan 2018
Discrete Autoencoders for Sequence Models
Lukasz Kaiser
Samy Bengio
BDL
92
50
0
29 Jan 2018
Multi-Pointer Co-Attention Networks for Recommendation
Yi Tay
Anh Tuan Luu
S. Hui
3DV
191
290
0
28 Jan 2018
MaskGAN: Better Text Generation via Filling in the______
W. Fedus
Ian Goodfellow
Andrew M. Dai
123
470
0
23 Jan 2018
Unsupervised Cipher Cracking Using Discrete GANs
Aidan Gomez
Sicong Huang
Ivan Zhang
Bryan M. Li
Muhammad Osama
Lukasz Kaiser
GAN
64
59
0
15 Jan 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
156
102
0
14 Jan 2018
Improved English to Russian Translation by Neural Suffix Prediction
Kai Song
Yue Zhang
Min Zhang
Weihua Luo
43
10
0
11 Jan 2018
PixelSNAIL: An Improved Autoregressive Generative Model
Xi Chen
Nikhil Mishra
Mostafa Rohaninejad
Pieter Abbeel
DRL
DiffM
BDL
GAN
80
276
0
28 Dec 2017
A Flexible Approach to Automated RNN Architecture Generation
Martin Schrimpf
Stephen Merity
James Bradbury
R. Socher
59
16
0
20 Dec 2017
Sockeye: A Toolkit for Neural Machine Translation
Felix Hieber
Tobias Domhan
Michael J. Denkowski
David Vilar
Artem Sokolov
Ann Clifton
Matt Post
75
215
0
15 Dec 2017
Character-Based Handwritten Text Transcription with Attention Networks
Jason Poulos
Rafael Valle
49
32
0
11 Dec 2017
Stochastic Answer Networks for Machine Reading Comprehension
Xiaodong Liu
Yelong Shen
Kevin Duh
Jianfeng Gao
RALM
77
198
0
10 Dec 2017
Multi-channel Encoder for Neural Machine Translation
Hao Xiong
Zhongjun He
Xiaoguang Hu
Hua Wu
76
34
0
06 Dec 2017
Distance-based Self-Attention Network for Natural Language Inference
Jinbae Im
Sungzoon Cho
87
76
0
06 Dec 2017
Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
Salman Mohammed
Peng Shi
Jimmy J. Lin
85
106
0
05 Dec 2017
Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models
Rohit Prabhavalkar
Tara N. Sainath
Yonghui Wu
Patrick Nguyen
Zhifeng Chen
Chung-Cheng Chiu
Anjuli Kannan
77
162
0
05 Dec 2017
Improving the Performance of Online Neural Transducer Models
Tara N. Sainath
Chung-Cheng Chiu
Rohit Prabhavalkar
Anjuli Kannan
Yonghui Wu
Patrick Nguyen
Zhifeng Chen
AI4TS
97
49
0
05 Dec 2017
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
...
Katya Gonina
Navdeep Jaitly
Yue Liu
J. Chorowski
M. Bacchiani
AI4TS
125
1,155
0
05 Dec 2017
Deep Semantic Role Labeling with Self-Attention
Zhixing Tan
Mingxuan Wang
Jun Xie
Yidong Chen
X. Shi
87
311
0
05 Dec 2017
Relation Networks for Object Detection
Han Hu
Jiayuan Gu
Zheng Zhang
Jifeng Dai
Yichen Wei
ObjD
148
1,227
0
30 Nov 2017
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
131
1,725
0
28 Nov 2017
Population Based Training of Neural Networks
Max Jaderberg
Valentin Dalibard
Simon Osindero
Wojciech M. Czarnecki
Jeff Donahue
...
Tim Green
Iain Dunning
Karen Simonyan
Chrisantha Fernando
Koray Kavukcuoglu
98
745
0
27 Nov 2017
Neural Text Generation: A Practical Guide
Ziang Xie
57
46
0
27 Nov 2017
SkipNet: Learning Dynamic Routing in Convolutional Networks
Xin Wang
Feng Yu
Zi-Yi Dou
Trevor Darrell
Joseph E. Gonzalez
119
640
0
26 Nov 2017
Convolutional Image Captioning
J. Aneja
Aditya Deshpande
Alex Schwing
VLM
137
361
0
24 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
364
8,931
0
21 Nov 2017
Speech recognition for medical conversations
Chung-Cheng Chiu
Anshuman Tripathi
Katherine Chou
Chris Co
Navdeep Jaitly
...
Ananth Sankar
Justin Tansuwan
Nathan Wan
Yonghui Wu
Xuedong Zhang
LM&MA
70
84
0
20 Nov 2017
ATRank: An Attention-Based User Behavior Modeling Framework for Recommendation
Chang Zhou
Jinze Bai
Junshuai Song
Xiaofei Liu
Zhengchao Zhao
Xiusi Chen
Jun Gao
HAI
95
309
0
17 Nov 2017
Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method
Xu Sun
Xuancheng Ren
Shuming Ma
Bingzhen Wei
Wei Li
Jingjing Xu
Houfeng Wang
Yi Zhang
56
24
0
17 Nov 2017
Image Matters: Visually modeling user behaviors using Advanced Model Server
T. Ge
Liqin Zhao
Guorui Zhou
Keyu Chen
Shuying Liu
...
Sui Huang
Qing Cui
Xiaoqiang Zhu
Yu Zhang
Kun Gai
80
41
0
17 Nov 2017
Attend and Interact: Higher-Order Object Interactions for Video Understanding
Chih-Yao Ma
Asim Kadav
I. Melvin
Z. Kira
G. Al-Regib
H. Graf
80
145
0
16 Nov 2017
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension
Hsin-Yuan Huang
Chenguang Zhu
Yelong Shen
Weizhu Chen
FedML
87
183
0
16 Nov 2017
Motif-based Convolutional Neural Network on Graphs
Aravind Sankar
Xinyang Zhang
Kevin Chen-Chuan Chang
GNN
87
42
0
15 Nov 2017
Controllable Abstractive Summarization
Angela Fan
David Grangier
Michael Auli
103
312
0
14 Nov 2017
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
120
186
0
14 Nov 2017
QuickEdit: Editing Text & Translations by Crossing Words Out
David Grangier
Michael Auli
KELM
61
10
0
13 Nov 2017