ResearchTrend.AI
arXiv: 1706.03762
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 19,430 papers shown
Large-Scale Learnable Graph Convolutional Networks
Hongyang Gao
Zhengyang Wang
Shuiwang Ji
GNN
33
588
0
12 Aug 2018
Neural Network Encapsulation
Hongyang Li
Xiaoyang Guo
Bo Dai
Wanli Ouyang
Xiaogang Wang
21
51
0
11 Aug 2018
Ancient-Modern Chinese Translation with a Large Training Dataset
Dayiheng Liu
Jiancheng Lv
Kexin Yang
Qian Qu
24
13
0
11 Aug 2018
Large Scale Language Modeling: Converging on 40GB of Text in Four Hours
Raul Puri
Robert M. Kirby
Nikolai Yakovenko
Bryan Catanzaro
19
29
0
03 Aug 2018
Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification
Yang Du
Chunfen Yuan
Bing Li
Lili Zhao
Yangxi Li
Weiming Hu
81
79
0
03 Aug 2018
Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction
Bo Yang
Sen Wang
Andrew Markham
Niki Trigoni
3DPC
3DV
29
138
0
02 Aug 2018
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
Mandy Guo
Qinlan Shen
Yinfei Yang
Heming Ge
Daniel Cer
...
K. Stevens
Noah Constant
Yun-hsuan Sung
B. Strope
R. Kurzweil
53
111
0
31 Jul 2018
Doubly Attentive Transformer Machine Translation
Hasan Sait Arslan
Mark Fishel
G. Anbarjafari
35
13
0
30 Jul 2018
Active Learning for Interactive Neural Machine Translation of Data Streams
Álvaro Peris
F. Casacuberta
AI4CE
37
60
0
30 Jul 2018
"Bilingual Expert" Can Find Translation Errors
Kai Fan
Jiayi Wang
Yue Liu
Fengming Zhou
Boxing Chen
Luo Si
MoE
27
57
0
25 Jul 2018
Zero-shot keyword spotting for visual speech recognition in-the-wild
Themos Stafylakis
Georgios Tzimiropoulos
35
38
0
23 Jul 2018
SCAN: Self-and-Collaborative Attention Network for Video Person Re-identification
Ruimao Zhang
Hongbin Sun
Jingyu Li
Yuying Ge
Liang Lin
Ping Luo
Xiaogang Wang
27
75
0
16 Jul 2018
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking
Shikhar Murty
Pat Verga
Luke Vilnis
Irena Radovanovic
Andrew McCallum
32
91
0
13 Jul 2018
Video-based Person Re-identification via 3D Convolutional Networks and Non-local Attention
Xingyu Liao
Lingxiao He
Zhouwang Yang
Chi Zhang
3DPC
30
72
0
12 Jul 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
42
745
0
10 Jul 2018
NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning
Álvaro Peris
F. Casacuberta
KELM
HAI
42
21
0
09 Jul 2018
Position-aware Self-attention with Relative Positional Encodings for Slot Filling
I. Bilan
Benjamin Roth
26
22
0
09 Jul 2018
Neural Machine Translation with Key-Value Memory-Augmented Attention
Fandong Meng
Zhaopeng Tu
Yong Cheng
Haiyang Wu
Junjie Zhai
Yuekui Yang
Di Wang
29
20
0
29 Jun 2018
Enhancing Sentence Embedding with Generalized Pooling
Qian Chen
Zhenhua Ling
Xiao-Dan Zhu
30
68
0
26 Jun 2018
Understanding Dropout as an Optimization Trick
Sangchul Hahn
Heeyoul Choi
ODL
13
34
0
26 Jun 2018
Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced from Comparable Corpora
Sree Harsha Ramesh
Krishnamurthy Sankaranarayanan
22
36
0
25 Jun 2018
Focusing on What is Relevant: Time-Series Learning and Understanding using Attention
Phongtharin Vinayavekhin
Subhajit Chaudhury
Asim Munawar
Don Joven Agravante
Giovanni De Magistris
Daiki Kimura
Ryuki Tachibana
AI4TS
24
24
0
22 Jun 2018
A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation
Surafel Melaku Lakew
Mauro Cettolo
Marcello Federico
33
103
0
18 Jun 2018
Multi-variable LSTM neural network for autoregressive exogenous model
Tian Guo
Tao R. Lin
BDL
AI4TS
48
19
0
17 Jun 2018
Evaluation of sentence embeddings in downstream and linguistic probing tasks
C. Perone
Roberto Silveira
Thomas S. Paula
ELM
44
154
0
16 Jun 2018
An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization
Gongbo Tang
Fabienne Cap
Eva Pettersson
Joakim Nivre
19
43
0
13 Jun 2018
On Accurate Evaluation of GANs for Language Generation
Stanislau Semeniuta
Aliaksei Severyn
Sylvain Gelly
EGVM
39
81
0
13 Jun 2018
Double Path Networks for Sequence to Sequence Learning
Kaitao Song
Xu Tan
Di He
Jianfeng Lu
Tao Qin
Tie-Yan Liu
35
14
0
13 Jun 2018
Let's do it "again": A First Computational Approach to Detecting Adverbial Presupposition Triggers
Andre Cianflone
Yulan Feng
Jad Kabbara
Jackie C.K. Cheung
64
9
0
11 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
23
30
0
11 Jun 2018
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Yikang Shen
Zhouhan Lin
Athul Paul Jacob
Alessandro Sordoni
Aaron Courville
Yoshua Bengio
25
91
0
11 Jun 2018
Findings of the Second Workshop on Neural Machine Translation and Generation
Alexandra Birch
A. Finch
Minh-Thang Luong
Graham Neubig
Yusuke Oda
DRL
40
12
0
08 Jun 2018
A Simple Method for Commonsense Reasoning
Trieu H. Trinh
Quoc V. Le
LRM
ReLM
60
432
0
07 Jun 2018
Relational Deep Reinforcement Learning
V. Zambaldi
David Raposo
Adam Santoro
V. Bapst
Yujia Li
...
Victoria Langston
Razvan Pascanu
M. Botvinick
Oriol Vinyals
Peter W. Battaglia
OffRL
24
219
0
05 Jun 2018
Relational recurrent neural networks
Adam Santoro
Ryan Faulkner
David Raposo
Jack W. Rae
Mike Chrzanowski
T. Weber
Daan Wierstra
Oriol Vinyals
Razvan Pascanu
Timothy Lillicrap
GNN
30
209
0
05 Jun 2018
Videos as Space-Time Region Graphs
Xiaolong Wang
Abhinav Gupta
36
752
0
05 Jun 2018
Relational inductive biases, deep learning, and graph networks
Peter W. Battaglia
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
V. Zambaldi
...
Pushmeet Kohli
M. Botvinick
Oriol Vinyals
Yujia Li
Razvan Pascanu
AI4CE
NAI
121
3,088
0
04 Jun 2018
On the Importance of Attention in Meta-Learning for Few-Shot Text Classification
Xiang Jiang
Mohammad Havaei
Gabriel Chartrand
Hassan Chouaib
Thomas Vincent
Andrew Jesson
Nicolas Chapados
Stan Matwin
VLM
16
18
0
03 Jun 2018
Scaling Neural Machine Translation
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
76
610
0
01 Jun 2018
Explaining Explanations: An Overview of Interpretability of Machine Learning
Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
XAI
40
1,844
0
31 May 2018
On the Impact of Various Types of Noise on Neural Machine Translation
Huda Khayrallah
Philipp Koehn
AAML
34
218
0
31 May 2018
Theory and Experiments on Vector Quantized Autoencoders
Aurko Roy
Ashish Vaswani
Arvind Neelakantan
Niki Parmar
21
85
0
28 May 2018
Lipschitz regularity of deep neural networks: analysis and efficient estimation
Kevin Scaman
Aladin Virmaux
40
518
0
28 May 2018
A Stochastic Decoder for Neural Machine Translation
P. Schulz
Wilker Aziz
Trevor Cohn
BDL
32
29
0
28 May 2018
OpenNMT: Neural Machine Translation Toolkit
Guillaume Klein
Yoon Kim
Yuntian Deng
Vincent Nguyen
Jean Senellart
Alexander M. Rush
152
119
0
28 May 2018
Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings
Dat Quoc Nguyen
Karin Verspoor
NAI
MedIm
16
46
0
27 May 2018
Stable Recurrent Models
John Miller
Moritz Hardt
21
116
0
25 May 2018
Zero-Shot Dual Machine Translation
L. Sestorain
Massimiliano Ciaramita
Christian Buck
Thomas Hofmann
39
23
0
25 May 2018
Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Manoj Kumar
George E. Dahl
Vijay Vasudevan
Mohammad Norouzi
36
25
0
25 May 2018
Phrase Table as Recommendation Memory for Neural Machine Translation
Yang Zhao
Yining Wang
Jiajun Zhang
Chengqing Zong
42
30
0
25 May 2018