1706.03762
Cited By
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 19,017 papers shown
Title
Capacity Control of ReLU Neural Networks by Basis-path Norm
Shuxin Zheng
Qi Meng
Huishuai Zhang
Wei-neng Chen
Nenghai Yu
Tie-Yan Liu
24
23
0
19 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
23
51
0
18 Sep 2018
Adaptive Sampling Towards Fast Graph Representation Learning
Wen-bing Huang
Tong Zhang
Yu Rong
Junzhou Huang
GNN
22
487
0
14 Sep 2018
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
Tianze Shi
Kedar Tatwawadi
K. Chakrabarti
Yi Mao
Oleksandr Polozov
Weizhu Chen
23
64
0
13 Sep 2018
Deep learning for time series classification: a review
Hassan Ismail Fawaz
Germain Forestier
J. Weber
L. Idoumghar
Pierre-Alain Muller
AI4TS
AI4CE
136
2,650
0
12 Sep 2018
Music Transformer
Cheng-Zhi Anna Huang
Ashish Vaswani
Jakob Uszkoreit
Noam M. Shazeer
Ian Simon
Curtis Hawthorne
Andrew M. Dai
Matthew D. Hoffman
Monica Dinculescu
Douglas Eck
54
472
0
12 Sep 2018
What can linguistics and deep learning contribute to each other?
Tal Linzen
21
40
0
11 Sep 2018
On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation
Tamer Alkhouli
Gabriel Bretschner
Hermann Ney
21
63
0
11 Sep 2018
Dual Attention Network for Scene Segmentation
J. Fu
Jiaheng Liu
Haijie Tian
Yong Li
Yongjun Bao
Zhiwei Fang
Hanqing Lu
SSeg
98
5,060
0
09 Sep 2018
Neural Machine Translation of Logographic Languages Using Sub-character Level Information
Longtu Zhang
Mamoru Komachi
32
43
0
07 Sep 2018
Deep Learning for Generic Object Detection: A Survey
Li Liu
Wanli Ouyang
Xiaogang Wang
Paul Fieguth
Jie Chen
Xinwang Liu
M. Pietikäinen
ObjD
VLM
OOD
77
2,427
0
06 Sep 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
27
688
0
06 Sep 2018
Training Millions of Personalized Dialogue Agents
Pierre-Emmanuel Mazaré
Samuel Humeau
Martin Raison
Antoine Bordes
37
266
0
06 Sep 2018
Deep Relevance Ranking Using Enhanced Document-Query Interactions
Ryan T. McDonald
George Brokos
Ion Androutsopoulos
19
126
0
05 Sep 2018
Recurrent World Models Facilitate Policy Evolution
David R Ha
Jürgen Schmidhuber
SyDa
TPM
52
920
0
04 Sep 2018
OCNet: Object Context Network for Scene Parsing
Yuhui Yuan
Lang Huang
Jianyuan Guo
Chao Zhang
Xilin Chen
Jingdong Wang
25
600
0
04 Sep 2018
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu
Haoran Shi
Bowen Tan
Wentao Wang
Zichao Yang
...
Zhengzhong Liu
Xiaodan Liang
Wangrong Zhu
Devendra Singh Sachan
Eric Xing
VLM
25
56
0
04 Sep 2018
Convolutional Neural Network for Trajectory Prediction
N. Nikhil
B. Morris
HAI
27
56
0
03 Sep 2018
Trivial Transfer Learning for Low-Resource Neural Machine Translation
Tom Kocmi
Ondrej Bojar
33
171
0
02 Sep 2018
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter
Lijun Wu
Xu Tan
Di He
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
18
48
0
01 Sep 2018
Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
Wei Wang
Taro Watanabe
Macduff Hughes
Tetsuji Nakagawa
Ciprian Chelba
38
91
0
31 Aug 2018
Spherical Latent Spaces for Stable Variational Autoencoders
Jiacheng Xu
Greg Durrett
BDL
DRL
14
193
0
31 Aug 2018
Beyond Weight Tying: Learning Joint Input-Output Embeddings for Neural Machine Translation
Nikolaos Pappas
Lesly Miculicich
James Henderson
18
16
0
31 Aug 2018
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
Chen Gao
Yuliang Zou
Jia-Bin Huang
19
294
0
30 Aug 2018
An Operation Sequence Model for Explainable Neural Machine Translation
Felix Stahlberg
Danielle Saunders
Bill Byrne
LRM
MILM
40
29
0
29 Aug 2018
Targeted Syntactic Evaluation of Language Models
Rebecca Marvin
Tal Linzen
31
408
0
27 Aug 2018
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters
Mark Neumann
Luke Zettlemoyer
Wen-tau Yih
35
428
0
27 Aug 2018
A neural attention model for speech command recognition
Douglas Coimbra de Andrade
Sabato Leo
M. Viana
Christoph Bernkopf
19
145
0
27 Aug 2018
A Study of Reinforcement Learning for Neural Machine Translation
Lijun Wu
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
OffRL
27
182
0
27 Aug 2018
Training Deeper Neural Machine Translation Models with Transparent Attention
Ankur Bapna
Mengzhao Chen
Orhan Firat
Yuan Cao
Yonghui Wu
29
138
0
22 Aug 2018
SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation
Xinyi Wang
Hieu H. Pham
Zihang Dai
Graham Neubig
21
195
0
22 Aug 2018
XAI Beyond Classification: Interpretable Neural Clustering
Xi Peng
Yunfan Li
Ivor W. Tsang
Erik Cambria
Jiancheng Lv
Qiufeng Wang
29
74
0
22 Aug 2018
Are You Tampering With My Data?
Michele Alberti
Vinaychandran Pondenkandath
Marcel Würsch
Manuel Bouillon
Mathias Seuret
Rolf Ingold
Marcus Liwicki
AAML
37
19
0
21 Aug 2018
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
88
3,475
0
19 Aug 2018
Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation
Yi-Lin Tuan
Hung-yi Lee
GAN
30
55
0
16 Aug 2018
Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos
Zhaoyang Zhang
Zhanghui Kuang
Ping Luo
Xue Jiang
Wayne Zhang
19
12
0
15 Aug 2018
Regularizing Neural Machine Translation by Target-bidirectional Agreement
Zhirui Zhang
Shuo Ren
Shujie Liu
Mu Li
M. Zhou
Tong Xu
37
116
0
13 Aug 2018
Large-Scale Learnable Graph Convolutional Networks
Hongyang Gao
Zhengyang Wang
Shuiwang Ji
GNN
33
588
0
12 Aug 2018
Neural Network Encapsulation
Hongyang Li
Xiaoyang Guo
Bo Dai
Wanli Ouyang
Xiaogang Wang
21
51
0
11 Aug 2018
Ancient-Modern Chinese Translation with a Large Training Dataset
Dayiheng Liu
Jiancheng Lv
Kexin Yang
Qian Qu
24
13
0
11 Aug 2018
Large Scale Language Modeling: Converging on 40GB of Text in Four Hours
Raul Puri
Robert M. Kirby
Nikolai Yakovenko
Bryan Catanzaro
19
29
0
03 Aug 2018
Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification
Yang Du
Chunfen Yuan
Bing Li
Lili Zhao
Yangxi Li
Weiming Hu
81
79
0
03 Aug 2018
Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction
Bo Yang
Sen Wang
Andrew Markham
Niki Trigoni
3DPC
3DV
29
138
0
02 Aug 2018
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
Mandy Guo
Qinlan Shen
Yinfei Yang
Heming Ge
Daniel Cer
...
K. Stevens
Noah Constant
Yun-hsuan Sung
B. Strope
R. Kurzweil
53
111
0
31 Jul 2018
Doubly Attentive Transformer Machine Translation
Hasan Sait Arslan
Mark Fishel
G. Anbarjafari
35
13
0
30 Jul 2018
Active Learning for Interactive Neural Machine Translation of Data Streams
Álvaro Peris
F. Casacuberta
AI4CE
37
60
0
30 Jul 2018
"Bilingual Expert" Can Find Translation Errors
Kai Fan
Jiayi Wang
Bo Li
Fengming Zhou
Boxing Chen
Luo Si
MoE
19
57
0
25 Jul 2018
Zero-shot keyword spotting for visual speech recognition in-the-wild
Themos Stafylakis
Georgios Tzimiropoulos
35
38
0
23 Jul 2018
SCAN: Self-and-Collaborative Attention Network for Video Person Re-identification
Ruimao Zhang
Hongbin Sun
Jingyu Li
Yuying Ge
Liang Lin
Ping Luo
Xiaogang Wang
25
75
0
16 Jul 2018
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking
Shikhar Murty
Pat Verga
Luke Vilnis
Irena Radovanovic
Andrew McCallum
32
91
0
13 Jul 2018