Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 18,591 papers shown
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Yikang Shen
Zhouhan Lin
Athul Paul Jacob
Alessandro Sordoni
Aaron Courville
Yoshua Bengio
25
91
0
11 Jun 2018
Findings of the Second Workshop on Neural Machine Translation and Generation
Alexandra Birch
A. Finch
Minh-Thang Luong
Graham Neubig
Yusuke Oda
DRL
33
12
0
08 Jun 2018
A Simple Method for Commonsense Reasoning
Trieu H. Trinh
Quoc V. Le
LRM
ReLM
31
432
0
07 Jun 2018
Relational Deep Reinforcement Learning
V. Zambaldi
David Raposo
Adam Santoro
V. Bapst
Yujia Li
...
Victoria Langston
Razvan Pascanu
M. Botvinick
Oriol Vinyals
Peter W. Battaglia
OffRL
24
219
0
05 Jun 2018
Relational recurrent neural networks
Adam Santoro
Ryan Faulkner
David Raposo
Jack W. Rae
Mike Chrzanowski
T. Weber
Daan Wierstra
Oriol Vinyals
Razvan Pascanu
Timothy Lillicrap
GNN
30
209
0
05 Jun 2018
Videos as Space-Time Region Graphs
Xiaolong Wang
Abhinav Gupta
36
752
0
05 Jun 2018
Relational inductive biases, deep learning, and graph networks
Peter W. Battaglia
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
V. Zambaldi
...
Pushmeet Kohli
M. Botvinick
Oriol Vinyals
Yujia Li
Razvan Pascanu
AI4CE
NAI
121
3,087
0
04 Jun 2018
On the Importance of Attention in Meta-Learning for Few-Shot Text Classification
Xiang Jiang
Mohammad Havaei
Gabriel Chartrand
Hassan Chouaib
Thomas Vincent
Andrew Jesson
Nicolas Chapados
Stan Matwin
VLM
14
18
0
03 Jun 2018
Scaling Neural Machine Translation
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
71
610
0
01 Jun 2018
Explaining Explanations: An Overview of Interpretability of Machine Learning
Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
XAI
40
1,842
0
31 May 2018
On the Impact of Various Types of Noise on Neural Machine Translation
Huda Khayrallah
Philipp Koehn
AAML
34
218
0
31 May 2018
Theory and Experiments on Vector Quantized Autoencoders
Aurko Roy
Ashish Vaswani
Arvind Neelakantan
Niki Parmar
16
85
0
28 May 2018
Lipschitz regularity of deep neural networks: analysis and efficient estimation
Kevin Scaman
Aladin Virmaux
40
518
0
28 May 2018
A Stochastic Decoder for Neural Machine Translation
P. Schulz
Wilker Aziz
Trevor Cohn
BDL
30
29
0
28 May 2018
OpenNMT: Neural Machine Translation Toolkit
Guillaume Klein
Yoon Kim
Yuntian Deng
Vincent Nguyen
Jean Senellart
Alexander M. Rush
144
119
0
28 May 2018
Convolutional neural networks for chemical-disease relation extraction are improved with character-based word embeddings
Dat Quoc Nguyen
Karin Verspoor
NAI
MedIm
16
46
0
27 May 2018
Zero-Shot Dual Machine Translation
L. Sestorain
Massimiliano Ciaramita
Christian Buck
Thomas Hofmann
39
23
0
25 May 2018
Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Manoj Kumar
George E. Dahl
Vijay Vasudevan
Mohammad Norouzi
28
25
0
25 May 2018
Phrase Table as Recommendation Memory for Neural Machine Translation
Yang Zhao
Yining Wang
Jiajun Zhang
Chengqing Zong
42
30
0
25 May 2018
TADAM: Task dependent adaptive metric for improved few-shot learning
Boris N. Oreshkin
Pau Rodríguez López
Alexandre Lacoste
56
1,303
0
23 May 2018
AffinityNet: semi-supervised few-shot learning for disease type prediction
Tianle Ma
A. Zhang
24
55
0
22 May 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
27
44
0
22 May 2018
Sparse and Constrained Attention for Neural Machine Translation
Chaitanya Malaviya
Pedro Ferreira
André F. T. Martins
22
62
0
21 May 2018
Global-Locally Self-Attentive Dialogue State Tracker
Victor Zhong
Caiming Xiong
R. Socher
13
188
0
19 May 2018
Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation
Thi-Vinh Ngo
Thanh-Le Ha
Phuong-Thai Nguyen
Le-Minh Nguyen
24
8
0
18 May 2018
Cross-Target Stance Classification with Self-Attention Networks
Chang Xu
Cécile Paris
Surya Nepal
R. Sparks
OOD
20
128
0
17 May 2018
Towards Robust Neural Machine Translation
Yong Cheng
Zhaopeng Tu
Fandong Meng
Junjie Zhai
Yang Liu
AAML
25
161
0
16 May 2018
RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition
Albert Zeyer
Tamer Alkhouli
Hermann Ney
37
90
0
14 May 2018
Bag-of-Words as Target for Neural Machine Translation
Shuming Ma
Xu Sun
Yizhong Wang
Junyang Lin
3DV
16
76
0
13 May 2018
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
60
1,592
0
13 May 2018
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
Luheng He
Kenton Lee
Omer Levy
Luke Zettlemoyer
19
188
0
12 May 2018
Deep Nets: What have they ever done for Vision?
Alan Yuille
Chenxi Liu
28
100
0
10 May 2018
Global Encoding for Abstractive Summarization
Junyang Lin
Xu Sun
Shuming Ma
Qi Su
21
146
0
10 May 2018
Neural Machine Translation Decoding with Terminology Constraints
Eva Hasler
Adria de Gispert
Gonzalo Iglesias
Bill Byrne
AI4CE
26
108
0
09 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders
F. Bianchi
L. Livi
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
AI4TS
30
11
0
09 May 2018
Reasoning with Sarcasm by Reading In-between
Yi Tay
Anh Tuan Luu
S. Hui
Jian Su
LRM
32
172
0
08 May 2018
Transformer for Emotion Recognition
Jean-Benoit Delbrouck
20
1
0
03 May 2018
Facial Landmarks Localization using Cascaded Neural Networks
Shahar Mahpod
Rig Das
E. Maiorana
Y. Keller
P. Campisi
CVBM
19
18
0
03 May 2018
Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge
Ziqi Zheng
Chenjie Cao
Xingwei Chen
Guoqiang Xu
38
19
0
03 May 2018
Constituency Parsing with a Self-Attentive Encoder
Nikita Kitaev
Dan Klein
30
535
0
02 May 2018
Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together
Tao Shen
Tianyi Zhou
Guodong Long
Jing Jiang
Chengqi Zhang
43
14
0
02 May 2018
Accelerating Neural Transformer via an Average Attention Network
Biao Zhang
Deyi Xiong
Jinsong Su
27
120
0
02 May 2018
Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT
Danielle Saunders
Felix Stahlberg
Adria de Gispert
Bill Byrne
27
25
0
01 May 2018
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
Rui Wang
Masao Utiyama
Eiichiro Sumita
42
27
0
01 May 2018
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Taku Kudo
51
1,147
0
29 Apr 2018
Improving Entity Linking by Modeling Latent Relations between Mentions
Phong Le
Ivan Titov
KELM
19
201
0
27 Apr 2018
Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models
Hendrik Strobelt
Sebastian Gehrmann
M. Behrisch
Adam Perer
Hanspeter Pfister
Alexander M. Rush
VLM
HAI
31
239
0
25 Apr 2018
Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications
Guy Hadash
Einat Kermany
Boaz Carmeli
Ofer Lavi
George Kour
Alon Jacovi
AI4TS
27
42
0
24 Apr 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Adams Wei Yu
David Dohan
Minh-Thang Luong
Rui Zhao
Kai Chen
Mohammad Norouzi
Quoc V. Le
RALM
AIMat
35
1,092
0
23 Apr 2018
A neural interlingua for multilingual machine translation
Y. Lu
Phillip Keung
Faisal Ladhak
Vikas Bhardwaj
Shaonan Zhang
Jason Sun
AI4CE
34
125
0
23 Apr 2018