ResearchTrend.AI

Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV
arXiv:1706.03762

Papers citing "Attention Is All You Need"

50 / 19,538 papers shown
Gotta Learn Fast: A New Benchmark for Generalization in RL
Alex Nichol
Vicki Pfau
Christopher Hesse
Oleg Klimov
John Schulman
VLM
OffRL
15
177
0
10 Apr 2018
Real-Time Prediction of the Duration of Distribution System Outages
Aaron Jaech
Baosen Zhang
Mari Ostendorf
D. Kirschen
18
74
0
03 Apr 2018
Probing Physics Knowledge Using Tools from Developmental Psychology
Luis S. Piloto
Ari Weinstein
TB Dhruva
Arun Ahuja
M. Berk Mirza
Greg Wayne
David Amos
Chia-Chun Hung
M. Botvinick
30
34
0
03 Apr 2018
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
Duy-Kien Nguyen
Takayuki Okatani
30
279
0
03 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
46
1,484
0
30 Mar 2018
Universal Sentence Encoder
Daniel Cer
Yinfei Yang
Sheng-yi Kong
Nan Hua
Nicole Limtiaco
...
Steve Yuan
Chris Tar
Yun-hsuan Sung
B. Strope
R. Kurzweil
61
1,891
0
29 Mar 2018
World Models
David R Ha
Jürgen Schmidhuber
SyDa
53
1,036
0
27 Mar 2018
Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks
Yoav Kaempfer
Lior Wolf
27
71
0
26 Mar 2018
Self-Attentional Acoustic Models
Matthias Sperber
Jan Niehues
Graham Neubig
Sebastian Stüker
A. Waibel
22
151
0
26 Mar 2018
code2vec: Learning Distributed Representations of Code
Uri Alon
Meital Zilberstein
Omer Levy
Eran Yahav
34
1,157
0
26 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
41
815
0
23 Mar 2018
Attention, Learn to Solve Routing Problems!
W. Kool
H. V. Hoof
Max Welling
28
1,182
0
22 Mar 2018
AllenNLP: A Deep Semantic Natural Language Processing Platform
Matt Gardner
Joel Grus
Mark Neumann
Oyvind Tafjord
Pradeep Dasigi
Nelson F. Liu
Matthew E. Peters
Michael Schmitz
Luke Zettlemoyer
VLM
28
1,276
0
20 Mar 2018
Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation
Felix Stahlberg
Danielle Saunders
Gonzalo Iglesias
Bill Byrne
36
11
0
20 Mar 2018
Learning Region Features for Object Detection
Jiayuan Gu
Han Hu
Liwei Wang
Yichen Wei
Jifeng Dai
ObjD
34
78
0
19 Mar 2018
Tensor2Tensor for Neural Machine Translation
Ashish Vaswani
Samy Bengio
E. Brevdo
François Chollet
Aidan Gomez
...
Nal Kalchbrenner
Niki Parmar
Ryan Sepassi
Noam M. Shazeer
Jakob Uszkoreit
60
528
0
16 Mar 2018
TBD: Benchmarking and Analyzing Deep Neural Network Training
Hongyu Zhu
Mohamed Akrout
Bojian Zheng
Andrew Pelegris
Amar Phanishayee
Bianca Schroeder
Gennady Pekhimenko
31
80
0
16 Mar 2018
Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection
Andy Brown
Aaron Tuor
Brian Hutchinson
Nicole Nichols
19
173
0
13 Mar 2018
Compositional Attention Networks for Machine Reasoning
Drew A. Hudson
Christopher D. Manning
BDL
OOD
LRM
32
573
0
08 Mar 2018
Generating Contradictory, Neutral, and Entailing Sentences
Songlin Yang
Shawn Tan
Chin-Wei Huang
Aaron Courville
19
3
0
07 Mar 2018
Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples
Minhao Cheng
Jinfeng Yi
Pin-Yu Chen
Huan Zhang
Cho-Jui Hsieh
SILM
AAML
54
242
0
03 Mar 2018
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Trieu H. Trinh
Andrew M. Dai
Thang Luong
Quoc V. Le
43
179
0
01 Mar 2018
Analyzing Uncertainty in Neural Machine Translation
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
UQLM
43
271
0
28 Feb 2018
Pop Music Highlighter: Marking the Emotion Keypoints
Yu-Siang Huang
Szu-Yu Chou
Yi-Hsuan Yang
23
17
0
28 Feb 2018
Shampoo: Preconditioned Stochastic Tensor Optimization
Vineet Gupta
Tomer Koren
Y. Singer
ODL
34
205
0
26 Feb 2018
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
50
864
0
23 Feb 2018
Attentive Tensor Product Learning
Qiuyuan Huang
Li Deng
D. Wu
Chang Liu
Xiaodong He
27
23
0
20 Feb 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
24
84
0
20 Feb 2018
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
Jason D. Lee
Elman Mansimov
Kyunghyun Cho
DiffM
BDL
42
455
0
19 Feb 2018
Global Pose Estimation with an Attention-based Recurrent Network
Emilio Parisotto
Devendra Singh Chaplot
Jian Zhang
Ruslan Salakhutdinov
26
70
0
19 Feb 2018
Universal Neural Machine Translation for Extremely Low Resource Languages
Jiatao Gu
Hany Hassan
Jacob Devlin
V. Li
35
275
0
15 Feb 2018
Multimodal Generative Models for Scalable Weakly-Supervised Learning
Mike Wu
Noah D. Goodman
DRL
39
378
0
14 Feb 2018
$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
Qi Meng
Shuxin Zheng
Huishuai Zhang
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
35
38
0
11 Feb 2018
Tree-to-tree Neural Networks for Program Translation
Xinyun Chen
Chang-rui Liu
D. Song
18
275
0
11 Feb 2018
On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks
Yukun Ding
Jinglan Liu
Jinjun Xiong
Yiyu Shi
MQ
37
21
0
10 Feb 2018
Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning
Myeongjun Jang
Seungwan Seo
Pilsung Kang
DRL
51
55
0
09 Feb 2018
Zero-Resource Neural Machine Translation with Multi-Agent Communication Game
Yun Chen
Yang Liu
V. Li
41
47
0
09 Feb 2018
Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Tao Shen
Dinesh Manocha
Guodong Long
Jing Jiang
Sen Wang
Chengqi Zhang
AI4TS
50
144
0
31 Jan 2018
Multi-Pointer Co-Attention Networks for Recommendation
Yi Tay
Anh Tuan Luu
S. Hui
3DV
29
287
0
28 Jan 2018
Context Models for OOV Word Translation in Low-Resource Languages
Angli Liu
Katrin Kirchhoff
29
9
0
26 Jan 2018
MaskGAN: Better Text Generation via Filling in the______
W. Fedus
Ian Goodfellow
Andrew M. Dai
24
468
0
23 Jan 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
35
101
0
14 Jan 2018
Distance-based Self-Attention Network for Natural Language Inference
Jinbae Im
Sungzoon Cho
43
76
0
06 Dec 2017
Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
Salman Mohammed
Peng Shi
Jimmy J. Lin
31
105
0
05 Dec 2017
Improving the Performance of Online Neural Transducer Models
Tara N. Sainath
Chung-Cheng Chiu
Rohit Prabhavalkar
Anjuli Kannan
Yonghui Wu
Patrick Nguyen
Zhehuai Chen
AI4TS
41
49
0
05 Dec 2017
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
...
Katya Gonina
Navdeep Jaitly
Yue Liu
J. Chorowski
M. Bacchiani
AI4TS
54
1,149
0
05 Dec 2017
Deep Semantic Role Labeling with Self-Attention
Zhixing Tan
Mingxuan Wang
Jun Xie
Yidong Chen
X. Shi
33
308
0
05 Dec 2017
SkipNet: Learning Dynamic Routing in Convolutional Networks
Xin Wang
Feng Yu
Zi-Yi Dou
Trevor Darrell
Joseph E. Gonzalez
39
626
0
26 Nov 2017
Convolutional Image Captioning
J. Aneja
Aditya Deshpande
Alex Schwing
VLM
37
360
0
24 Nov 2017
Speech recognition for medical conversations
Chung-Cheng Chiu
Anshuman Tripathi
Katherine Chou
Chris Co
Navdeep Jaitly
...
Ananth Sankar
Justin Tansuwan
Nathan Wan
Yonghui Wu
Xuedong Zhang
LM&MA
40
84
0
20 Nov 2017