Attention Is All You Need


12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 19,017 papers shown
Video-based Person Re-identification via 3D Convolutional Networks and Non-local Attention
Xingyu Liao
Lingxiao He
Zhouwang Yang
Chi Zhang
3DPC
30
72
0
12 Jul 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
37
745
0
10 Jul 2018
NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning
Álvaro Peris
F. Casacuberta
KELM
HAI
35
21
0
09 Jul 2018
Position-aware Self-attention with Relative Positional Encodings for Slot Filling
I. Bilan
Benjamin Roth
23
22
0
09 Jul 2018
Neural Machine Translation with Key-Value Memory-Augmented Attention
Fandong Meng
Zhaopeng Tu
Yong Cheng
Haiyang Wu
Junjie Zhai
Yuekui Yang
Di Wang
29
20
0
29 Jun 2018
Enhancing Sentence Embedding with Generalized Pooling
Qian Chen
Zhenhua Ling
Xiao-Dan Zhu
30
68
0
26 Jun 2018
Understanding Dropout as an Optimization Trick
Sangchul Hahn
Heeyoul Choi
ODL
13
34
0
26 Jun 2018
Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced from Comparable Corpora
Sree Harsha Ramesh
Krishnamurthy Sankaranarayanan
22
36
0
25 Jun 2018
Focusing on What is Relevant: Time-Series Learning and Understanding using Attention
Phongtharin Vinayavekhin
Subhajit Chaudhury
Asim Munawar
Don Joven Agravante
Giovanni De Magistris
Daiki Kimura
Ryuki Tachibana
AI4TS
16
24
0
22 Jun 2018
A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation
Surafel Melaku Lakew
Mauro Cettolo
Marcello Federico
33
103
0
18 Jun 2018
Multi-variable LSTM neural network for autoregressive exogenous model
Tian Guo
Tao R. Lin
BDL
AI4TS
48
19
0
17 Jun 2018
Evaluation of sentence embeddings in downstream and linguistic probing tasks
C. Perone
Roberto Silveira
Thomas S. Paula
ELM
44
154
0
16 Jun 2018
An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization
Gongbo Tang
Fabienne Cap
Eva Pettersson
Joakim Nivre
19
43
0
13 Jun 2018
On Accurate Evaluation of GANs for Language Generation
Stanislau Semeniuta
Aliaksei Severyn
Sylvain Gelly
EGVM
39
81
0
13 Jun 2018
Double Path Networks for Sequence to Sequence Learning
Kaitao Song
Xu Tan
Di He
Jianfeng Lu
Tao Qin
Tie-Yan Liu
35
14
0
13 Jun 2018
Let's do it "again": A First Computational Approach to Detecting Adverbial Presupposition Triggers
Andre Cianflone
Yulan Feng
Jad Kabbara
Jackie C.K. Cheung
64
9
0
11 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
23
30
0
11 Jun 2018
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Songlin Yang
Zhouhan Lin
Athul Paul Jacob
Alessandro Sordoni
Aaron Courville
Yoshua Bengio
25
91
0
11 Jun 2018
Findings of the Second Workshop on Neural Machine Translation and Generation
Alexandra Birch
A. Finch
Minh-Thang Luong
Graham Neubig
Yusuke Oda
DRL
35
12
0
08 Jun 2018
A Simple Method for Commonsense Reasoning
Trieu H. Trinh
Quoc V. Le
LRM
ReLM
42
432
0
07 Jun 2018
Relational Deep Reinforcement Learning
V. Zambaldi
David Raposo
Adam Santoro
V. Bapst
Yujia Li
...
Victoria Langston
Razvan Pascanu
M. Botvinick
Oriol Vinyals
Peter W. Battaglia
OffRL
24
219
0
05 Jun 2018
Relational recurrent neural networks
Adam Santoro
Ryan Faulkner
David Raposo
Jack W. Rae
Mike Chrzanowski
T. Weber
Daan Wierstra
Oriol Vinyals
Razvan Pascanu
Timothy Lillicrap
GNN
30
209
0
05 Jun 2018
Videos as Space-Time Region Graphs
Xinyu Wang
Abhinav Gupta
36
752
0
05 Jun 2018
Relational inductive biases, deep learning, and graph networks
Peter W. Battaglia
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
V. Zambaldi
...
Pushmeet Kohli
M. Botvinick
Oriol Vinyals
Yujia Li
Razvan Pascanu
AI4CE
NAI
121
3,087
0
04 Jun 2018
On the Importance of Attention in Meta-Learning for Few-Shot Text Classification
Xiang Jiang
Mohammad Havaei
Gabriel Chartrand
Hassan Chouaib
Thomas Vincent
Andrew Jesson
Nicolas Chapados
Stan Matwin
VLM
16
18
0
03 Jun 2018
Scaling Neural Machine Translation
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
71
610
0
01 Jun 2018
Explaining Explanations: An Overview of Interpretability of Machine Learning
Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
XAI
40
1,842
0
31 May 2018
On the Impact of Various Types of Noise on Neural Machine Translation
Huda Khayrallah
Philipp Koehn
AAML
34
218
0
31 May 2018
Theory and Experiments on Vector Quantized Autoencoders
Aurko Roy
Ashish Vaswani
Arvind Neelakantan
Niki Parmar
19
85
0
28 May 2018
Lipschitz regularity of deep neural networks: analysis and efficient estimation
Kevin Scaman
Aladin Virmaux
40
518
0
28 May 2018
A Stochastic Decoder for Neural Machine Translation
P. Schulz
Wilker Aziz
Trevor Cohn
BDL
30
29
0
28 May 2018
OpenNMT: Neural Machine Translation Toolkit
Guillaume Klein
Yoon Kim
Yuntian Deng
Vincent Nguyen
Jean Senellart
Alexander M. Rush
146
119
0
28 May 2018
Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings
Dat Quoc Nguyen
Karin Verspoor
NAI
MedIm
16
46
0
27 May 2018
Stable Recurrent Models
John Miller
Moritz Hardt
21
116
0
25 May 2018
Zero-Shot Dual Machine Translation
L. Sestorain
Massimiliano Ciaramita
Christian Buck
Thomas Hofmann
39
23
0
25 May 2018
Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Manoj Kumar
George E. Dahl
Vijay Vasudevan
Mohammad Norouzi
28
25
0
25 May 2018
Phrase Table as Recommendation Memory for Neural Machine Translation
Yang Zhao
Yining Wang
Jiajun Zhang
Chengqing Zong
42
30
0
25 May 2018
TADAM: Task dependent adaptive metric for improved few-shot learning
Boris N. Oreshkin
Pau Rodríguez López
Alexandre Lacoste
56
1,303
0
23 May 2018
Self-Attention-Based Message-Relevant Response Generation for Neural Conversation Model
Jonggu Kim
Doyeon Kong
Jong-Hyeok Lee
23
3
0
23 May 2018
AffinityNet: semi-supervised few-shot learning for disease type prediction
Tianle Ma
A. Zhang
27
55
0
22 May 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
27
44
0
22 May 2018
Sparse and Constrained Attention for Neural Machine Translation
Chaitanya Malaviya
Pedro Ferreira
André F. T. Martins
22
62
0
21 May 2018
Global-Locally Self-Attentive Dialogue State Tracker
Victor Zhong
Caiming Xiong
R. Socher
13
188
0
19 May 2018
Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation
Thi-Vinh Ngo
Thanh-Le Ha
Phuong-Thai Nguyen
Le-Minh Nguyen
24
8
0
18 May 2018
Cross-Target Stance Classification with Self-Attention Networks
Chang Xu
Cécile Paris
Surya Nepal
R. Sparks
OOD
20
128
0
17 May 2018
Towards Robust Neural Machine Translation
Yong Cheng
Zhaopeng Tu
Fandong Meng
Junjie Zhai
Yang Liu
AAML
25
161
0
16 May 2018
RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition
Albert Zeyer
Tamer Alkhouli
Hermann Ney
37
90
0
14 May 2018
Bag-of-Words as Target for Neural Machine Translation
Shuming Ma
Xu Sun
Yizhong Wang
Junyang Lin
3DV
16
76
0
13 May 2018
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
60
1,594
0
13 May 2018
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
Luheng He
Kenton Lee
Omer Levy
Luke Zettlemoyer
19
188
0
12 May 2018