1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 18,521 papers shown
Title
AffinityNet: semi-supervised few-shot learning for disease type prediction
Tianle Ma
A. Zhang
24
55
0
22 May 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
27
44
0
22 May 2018
Sparse and Constrained Attention for Neural Machine Translation
Chaitanya Malaviya
Pedro Ferreira
André F. T. Martins
22
62
0
21 May 2018
Global-Locally Self-Attentive Dialogue State Tracker
Victor Zhong
Caiming Xiong
R. Socher
13
188
0
19 May 2018
Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation
Thi-Vinh Ngo
Thanh-Le Ha
Phuong-Thai Nguyen
Le-Minh Nguyen
24
8
0
18 May 2018
Cross-Target Stance Classification with Self-Attention Networks
Chang Xu
Cécile Paris
Surya Nepal
R. Sparks
OOD
20
128
0
17 May 2018
Towards Robust Neural Machine Translation
Yong Cheng
Zhaopeng Tu
Fandong Meng
Junjie Zhai
Yang Liu
AAML
25
161
0
16 May 2018
RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition
Albert Zeyer
Tamer Alkhouli
Hermann Ney
37
90
0
14 May 2018
Bag-of-Words as Target for Neural Machine Translation
Shuming Ma
Xu Sun
Yizhong Wang
Junyang Lin
3DV
16
76
0
13 May 2018
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
60
1,592
0
13 May 2018
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
Luheng He
Kenton Lee
Omer Levy
Luke Zettlemoyer
19
188
0
12 May 2018
Deep Nets: What have they ever done for Vision?
Alan Yuille
Chenxi Liu
28
100
0
10 May 2018
Global Encoding for Abstractive Summarization
Junyang Lin
Xu Sun
Shuming Ma
Qi Su
21
146
0
10 May 2018
Neural Machine Translation Decoding with Terminology Constraints
Eva Hasler
Adria de Gispert
Gonzalo Iglesias
Bill Byrne
AI4CE
26
108
0
09 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders
F. Bianchi
L. Livi
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
AI4TS
30
11
0
09 May 2018
Reasoning with Sarcasm by Reading In-between
Yi Tay
Anh Tuan Luu
S. Hui
Jian Su
LRM
32
172
0
08 May 2018
Transformer for Emotion Recognition
Jean-Benoit Delbrouck
20
1
0
03 May 2018
Facial Landmarks Localization using Cascaded Neural Networks
Shahar Mahpod
Rig Das
E. Maiorana
Y. Keller
P. Campisi
CVBM
19
18
0
03 May 2018
Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge
Ziqi Zheng
Chenjie Cao
Xingwei Chen
Guoqiang Xu
38
19
0
03 May 2018
Constituency Parsing with a Self-Attentive Encoder
Nikita Kitaev
Dan Klein
30
535
0
02 May 2018
Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together
Tao Shen
Dinesh Manocha
Guodong Long
Jing Jiang
Chengqi Zhang
43
14
0
02 May 2018
Accelerating Neural Transformer via an Average Attention Network
Biao Zhang
Deyi Xiong
Jinsong Su
27
120
0
02 May 2018
Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT
Danielle Saunders
Felix Stahlberg
Adria de Gispert
Bill Byrne
27
25
0
01 May 2018
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
Rui Wang
Masao Utiyama
Eiichiro Sumita
40
27
0
01 May 2018
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Taku Kudo
51
1,147
0
29 Apr 2018
Improving Entity Linking by Modeling Latent Relations between Mentions
Phong Le
Ivan Titov
KELM
19
201
0
27 Apr 2018
Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models
Hendrik Strobelt
Sebastian Gehrmann
M. Behrisch
Adam Perer
Hanspeter Pfister
Alexander M. Rush
VLM
HAI
31
239
0
25 Apr 2018
Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications
Guy Hadash
Einat Kermany
Boaz Carmeli
Ofer Lavi
George Kour
Alon Jacovi
AI4TS
27
42
0
24 Apr 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Adams Wei Yu
David Dohan
Minh-Thang Luong
Rui Zhao
Kai Chen
Mohammad Norouzi
Quoc V. Le
RALM
AIMat
35
1,092
0
23 Apr 2018
A neural interlingua for multilingual machine translation
Y. Lu
Phillip Keung
Faisal Ladhak
Vikas Bhardwaj
Shaonan Zhang
Jason Sun
AI4CE
34
125
0
23 Apr 2018
Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks
Renjie Zheng
Junkun Chen
Xipeng Qiu
29
30
0
22 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,996
0
20 Apr 2018
Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
Matt Post
David Vilar
33
311
0
18 Apr 2018
Investigating Backtranslation in Neural Machine Translation
Alberto Poncelas
D. Shterionov
Andy Way
Gideon Maillette de Buy Wenniger
Peyman Passban
33
143
0
17 Apr 2018
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
Marcin Junczys-Dowmunt
Roman Grundkiewicz
Shubha Guha
Kenneth Heafield
33
192
0
16 Apr 2018
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
Arman Cohan
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Seokhwan Kim
W. Chang
Nazli Goharian
51
743
0
16 Apr 2018
Interact and Decide: Medley of Sub-Attention Networks for Effective Group Recommendation
Lucas Vinh Tran
T. Pham
Yi Tay
Yiding Liu
Gao Cong
Xiaoli Li
27
93
0
12 Apr 2018
Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation
Hyojin Bahng
Seungjoo Yoo
Wonwoong Cho
D. Park
Ziming Wu
Xiaojuan Ma
Jaegul Choo
VLM
24
86
0
11 Apr 2018
Attention U-Net: Learning Where to Look for the Pancreas
Ozan Oktay
Jo Schlemper
Loic Le Folgoc
M. J. Lee
M. Heinrich
...
Jingyu Sun
Nils Y. Hammerla
Bernhard Kainz
Ben Glocker
Daniel Rueckert
SSeg
39
4,949
0
11 Apr 2018
Gotta Learn Fast: A New Benchmark for Generalization in RL
Alex Nichol
Vicki Pfau
Christopher Hesse
Oleg Klimov
John Schulman
VLM
OffRL
15
177
0
10 Apr 2018
Real-Time Prediction of the Duration of Distribution System Outages
Aaron Jaech
Baosen Zhang
Mari Ostendorf
D. Kirschen
16
74
0
03 Apr 2018
Probing Physics Knowledge Using Tools from Developmental Psychology
Luis S. Piloto
Ari Weinstein
TB Dhruva
Arun Ahuja
M. Berk Mirza
Greg Wayne
David Amos
Chia-Chun Hung
M. Botvinick
30
34
0
03 Apr 2018
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
Duy-Kien Nguyen
Takayuki Okatani
27
279
0
03 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
46
1,479
0
30 Mar 2018
Universal Sentence Encoder
Daniel Cer
Yinfei Yang
Sheng-yi Kong
Nan Hua
Nicole Limtiaco
...
Steve Yuan
Chris Tar
Yun-hsuan Sung
B. Strope
R. Kurzweil
21
1,889
0
29 Mar 2018
World Models
David R Ha
Jürgen Schmidhuber
SyDa
32
1,031
0
27 Mar 2018
Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks
Yoav Kaempfer
Lior Wolf
27
71
0
26 Mar 2018
Self-Attentional Acoustic Models
Matthias Sperber
Jan Niehues
Graham Neubig
Sebastian Stüker
A. Waibel
22
151
0
26 Mar 2018
code2vec: Learning Distributed Representations of Code
Uri Alon
Meital Zilberstein
Omer Levy
Eran Yahav
14
1,157
0
26 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
26
815
0
23 Mar 2018