ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 19,017 papers shown
Deep Nets: What have they ever done for Vision?
Alan Yuille
Chenxi Liu
28
102
0
10 May 2018
Global Encoding for Abstractive Summarization
Junyang Lin
Xu Sun
Shuming Ma
Qi Su
21
146
0
10 May 2018
Neural Machine Translation Decoding with Terminology Constraints
Eva Hasler
Adria de Gispert
Gonzalo Iglesias
Bill Byrne
AI4CE
26
108
0
09 May 2018
Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders
F. Bianchi
L. Livi
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
AI4TS
30
11
0
09 May 2018
Reasoning with Sarcasm by Reading In-between
Yi Tay
Anh Tuan Luu
S. Hui
Jian Su
LRM
32
172
0
08 May 2018
Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction
Luowei Zhou
Nathan Louis
Jason J. Corso
39
94
0
08 May 2018
Multi-Domain Neural Machine Translation
Sander Tars
Mark Fishel
AI4CE
17
50
0
06 May 2018
Transformer for Emotion Recognition
Jean-Benoit Delbrouck
20
1
0
03 May 2018
Facial Landmarks Localization using Cascaded Neural Networks
Shahar Mahpod
Rig Das
E. Maiorana
Y. Keller
P. Campisi
CVBM
21
18
0
03 May 2018
Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge
Ziqi Zheng
Chenjie Cao
Xingwei Chen
Guoqiang Xu
38
19
0
03 May 2018
Constituency Parsing with a Self-Attentive Encoder
Nikita Kitaev
Dan Klein
30
535
0
02 May 2018
Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together
Tao Shen
Dinesh Manocha
Guodong Long
Jing Jiang
Chengqi Zhang
45
14
0
02 May 2018
Accelerating Neural Transformer via an Average Attention Network
Biao Zhang
Deyi Xiong
Jinsong Su
27
120
0
02 May 2018
Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT
Danielle Saunders
Felix Stahlberg
Adria de Gispert
Bill Byrne
27
25
0
01 May 2018
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
Rui Wang
Masao Utiyama
Eiichiro Sumita
42
27
0
01 May 2018
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Taku Kudo
60
1,149
0
29 Apr 2018
Improving Entity Linking by Modeling Latent Relations between Mentions
Phong Le
Ivan Titov
KELM
24
201
0
27 Apr 2018
Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models
Hendrik Strobelt
Sebastian Gehrmann
M. Behrisch
Adam Perer
Hanspeter Pfister
Alexander M. Rush
VLM
HAI
31
239
0
25 Apr 2018
Estimate and Replace: A Novel Approach to Integrating Deep Neural Networks with Existing Applications
Guy Hadash
Einat Kermany
Boaz Carmeli
Ofer Lavi
George Kour
Alon Jacovi
AI4TS
27
42
0
24 Apr 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Adams Wei Yu
David Dohan
Minh-Thang Luong
Rui Zhao
Kai Chen
Mohammad Norouzi
Quoc V. Le
RALM
AIMat
35
1,092
0
23 Apr 2018
A neural interlingua for multilingual machine translation
Y. Lu
Phillip Keung
Faisal Ladhak
Vikas Bhardwaj
Shaonan Zhang
Jason Sun
AI4CE
34
125
0
23 Apr 2018
Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks
Renjie Zheng
Junkun Chen
Xipeng Qiu
40
30
0
22 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018
Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
Matt Post
David Vilar
33
311
0
18 Apr 2018
Investigating Backtranslation in Neural Machine Translation
Alberto Poncelas
D. Shterionov
Andy Way
Gideon Maillette de Buy Wenniger
Peyman Passban
33
143
0
17 Apr 2018
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
Marcin Junczys-Dowmunt
Roman Grundkiewicz
Shubha Guha
Kenneth Heafield
33
192
0
16 Apr 2018
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
Arman Cohan
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Seokhwan Kim
W. Chang
Nazli Goharian
51
744
0
16 Apr 2018
Interact and Decide: Medley of Sub-Attention Networks for Effective Group Recommendation
Lucas Vinh Tran
T. Pham
Yi Tay
Yiding Liu
Gao Cong
Xiaoli Li
27
93
0
12 Apr 2018
Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation
Hyojin Bahng
Seungjoo Yoo
Wonwoong Cho
D. Park
Ziming Wu
Xiaojuan Ma
Jaegul Choo
VLM
30
86
0
11 Apr 2018
Attention U-Net: Learning Where to Look for the Pancreas
Ozan Oktay
Jo Schlemper
Loic Le Folgoc
M. J. Lee
M. Heinrich
...
Jingyu Sun
Nils Y. Hammerla
Bernhard Kainz
Ben Glocker
Daniel Rueckert
SSeg
39
4,955
0
11 Apr 2018
Gotta Learn Fast: A New Benchmark for Generalization in RL
Alex Nichol
Vicki Pfau
Christopher Hesse
Oleg Klimov
John Schulman
VLM
OffRL
15
177
0
10 Apr 2018
Real-Time Prediction of the Duration of Distribution System Outages
Aaron Jaech
Baosen Zhang
Mari Ostendorf
D. Kirschen
16
74
0
03 Apr 2018
Probing Physics Knowledge Using Tools from Developmental Psychology
Luis S. Piloto
Ari Weinstein
TB Dhruva
Arun Ahuja
M. Berk Mirza
Greg Wayne
David Amos
Chia-Chun Hung
M. Botvinick
30
34
0
03 Apr 2018
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
Duy-Kien Nguyen
Takayuki Okatani
30
279
0
03 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
46
1,481
0
30 Mar 2018
Universal Sentence Encoder
Daniel Cer
Yinfei Yang
Sheng-yi Kong
Nan Hua
Nicole Limtiaco
...
Steve Yuan
Chris Tar
Yun-hsuan Sung
B. Strope
R. Kurzweil
59
1,891
0
29 Mar 2018
World Models
David R Ha
Jürgen Schmidhuber
SyDa
50
1,036
0
27 Mar 2018
Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks
Yoav Kaempfer
Lior Wolf
27
71
0
26 Mar 2018
Self-Attentional Acoustic Models
Matthias Sperber
Jan Niehues
Graham Neubig
Sebastian Stüker
A. Waibel
22
151
0
26 Mar 2018
code2vec: Learning Distributed Representations of Code
Uri Alon
Meital Zilberstein
Omer Levy
Eran Yahav
34
1,157
0
26 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
38
815
0
23 Mar 2018
Attention, Learn to Solve Routing Problems!
W. Kool
H. V. Hoof
Max Welling
28
1,180
0
22 Mar 2018
AllenNLP: A Deep Semantic Natural Language Processing Platform
Matt Gardner
Joel Grus
Mark Neumann
Oyvind Tafjord
Pradeep Dasigi
Nelson F. Liu
Matthew E. Peters
Michael Schmitz
Luke Zettlemoyer
VLM
28
1,277
0
20 Mar 2018
Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation
Felix Stahlberg
Danielle Saunders
Gonzalo Iglesias
Bill Byrne
36
11
0
20 Mar 2018
Learning Region Features for Object Detection
Jiayuan Gu
Han Hu
Liwei Wang
Yichen Wei
Jifeng Dai
ObjD
32
78
0
19 Mar 2018
Tensor2Tensor for Neural Machine Translation
Ashish Vaswani
Samy Bengio
E. Brevdo
François Chollet
Aidan Gomez
...
Nal Kalchbrenner
Niki Parmar
Ryan Sepassi
Noam M. Shazeer
Jakob Uszkoreit
60
528
0
16 Mar 2018
TBD: Benchmarking and Analyzing Deep Neural Network Training
Hongyu Zhu
Mohamed Akrout
Bojian Zheng
Andrew Pelegris
Amar Phanishayee
Bianca Schroeder
Gennady Pekhimenko
31
80
0
16 Mar 2018
Compositional Attention Networks for Machine Reasoning
Drew A. Hudson
Christopher D. Manning
BDL
OOD
LRM
32
573
0
08 Mar 2018
Generating Contradictory, Neutral, and Entailing Sentences
Songlin Yang
Shawn Tan
Chin-Wei Huang
Aaron Courville
19
3
0
07 Mar 2018
Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples
Minhao Cheng
Jinfeng Yi
Pin-Yu Chen
Huan Zhang
Cho-Jui Hsieh
SILM
AAML
54
242
0
03 Mar 2018