ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 27,175 papers shown
Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models
Rohit Prabhavalkar
Tara N. Sainath
Yonghui Wu
Patrick Nguyen
Zhifeng Chen
Chung-Cheng Chiu
Anjuli Kannan
80
162
0
05 Dec 2017
Improving the Performance of Online Neural Transducer Models
Tara N. Sainath
Chung-Cheng Chiu
Rohit Prabhavalkar
Anjuli Kannan
Yonghui Wu
Patrick Nguyen
Zhifeng Chen
AI4TS
97
49
0
05 Dec 2017
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
...
Katya Gonina
Navdeep Jaitly
Yue Liu
J. Chorowski
M. Bacchiani
AI4TS
127
1,155
0
05 Dec 2017
Deep Semantic Role Labeling with Self-Attention
Zhixing Tan
Mingxuan Wang
Jun Xie
Yidong Chen
X. Shi
93
311
0
05 Dec 2017
Relation Networks for Object Detection
Han Hu
Jiayuan Gu
Zheng Zhang
Jifeng Dai
Yichen Wei
ObjD
150
1,230
0
30 Nov 2017
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
131
1,725
0
28 Nov 2017
Population Based Training of Neural Networks
Max Jaderberg
Valentin Dalibard
Simon Osindero
Wojciech M. Czarnecki
Jeff Donahue
...
Tim Green
Iain Dunning
Karen Simonyan
Chrisantha Fernando
Koray Kavukcuoglu
100
745
0
27 Nov 2017
Neural Text Generation: A Practical Guide
Ziang Xie
57
46
0
27 Nov 2017
SkipNet: Learning Dynamic Routing in Convolutional Networks
Xin Wang
Feng Yu
Zi-Yi Dou
Trevor Darrell
Joseph E. Gonzalez
150
640
0
26 Nov 2017
Convolutional Image Captioning
J. Aneja
Aditya Deshpande
Alex Schwing
VLM
137
361
0
24 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
366
8,940
0
21 Nov 2017
Speech recognition for medical conversations
Chung-Cheng Chiu
Anshuman Tripathi
Katherine Chou
Chris Co
Navdeep Jaitly
...
Ananth Sankar
Justin Tansuwan
Nathan Wan
Yonghui Wu
Xuedong Zhang
LM&MA
72
84
0
20 Nov 2017
ATRank: An Attention-Based User Behavior Modeling Framework for Recommendation
Chang Zhou
Jinze Bai
Junshuai Song
Xiaofei Liu
Zhengchao Zhao
Xiusi Chen
Jun Gao
HAI
95
309
0
17 Nov 2017
Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method
Xu Sun
Xuancheng Ren
Shuming Ma
Bingzhen Wei
Wei Li
Jingjing Xu
Houfeng Wang
Yi Zhang
56
24
0
17 Nov 2017
Image Matters: Visually modeling user behaviors using Advanced Model Server
T. Ge
Liqin Zhao
Guorui Zhou
Keyu Chen
Shuying Liu
...
Sui Huang
Qing Cui
Xiaoqiang Zhu
Yu Zhang
Kun Gai
83
41
0
17 Nov 2017
Attend and Interact: Higher-Order Object Interactions for Video Understanding
Chih-Yao Ma
Asim Kadav
I. Melvin
Z. Kira
G. Al-Regib
H. Graf
83
145
0
16 Nov 2017
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension
Hsin-Yuan Huang
Chenguang Zhu
Yelong Shen
Weizhu Chen
FedML
87
183
0
16 Nov 2017
Motif-based Convolutional Neural Network on Graphs
Aravind Sankar
Xinyang Zhang
Kevin Chen-Chuan Chang
GNN
91
42
0
15 Nov 2017
Controllable Abstractive Summarization
Angela Fan
David Grangier
Michael Auli
103
312
0
14 Nov 2017
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
122
186
0
14 Nov 2017
QuickEdit: Editing Text & Translations by Crossing Words Out
David Grangier
Michael Auli
KELM
71
10
0
13 Nov 2017
Few-Shot Learning with Graph Neural Networks
Victor Garcia Satorras
Joan Bruna
GNN
182
1,241
0
10 Nov 2017
Attend and Diagnose: Clinical Time Series Analysis using Attention Models
Huan Song
Deepta Rajan
Jayaraman J. Thiagarajan
A. Spanias
MLAU
112
456
0
10 Nov 2017
Non-Autoregressive Neural Machine Translation
Jiatao Gu
James Bradbury
Caiming Xiong
Victor O.K. Li
R. Socher
107
798
0
07 Nov 2017
Weighted Transformer Network for Machine Translation
Karim Ahmed
N. Keskar
R. Socher
84
134
0
06 Nov 2017
Attentional Pooling for Action Recognition
Rohit Girdhar
Deva Ramanan
135
321
0
04 Nov 2017
Fixing a Broken ELBO
Alexander A. Alemi
Ben Poole
Ian S. Fischer
Joshua V. Dillon
Rif A. Saurous
Kevin Patrick Murphy
DRL
BDL
101
80
0
01 Nov 2017
Paraphrase Generation with Deep Reinforcement Learning
Zichao Li
Xin Jiang
Lifeng Shang
Hang Li
OffRL
124
214
0
01 Nov 2017
DCN+: Mixed Objective and Deep Residual Coattention for Question Answering
Caiming Xiong
Victor Zhong
R. Socher
96
109
0
31 Oct 2017
Graph Attention Networks
Petar Velickovic
Guillem Cucurull
Arantxa Casanova
Adriana Romero
Pietro Lio
Yoshua Bengio
GNN
538
20,351
0
30 Oct 2017
Phase Conductor on Multi-layered Attentions for Machine Comprehension
R. Liu
Wei Wei
Weiguang Mao
M. Chikina
92
22
0
28 Oct 2017
Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
Pat Verga
Emma Strubell
O. Shai
Andrew McCallum
3DV
46
11
0
23 Oct 2017
ActivityNet Challenge 2017 Summary
Bernard Ghanem
Juan Carlos Niebles
Cees G. M. Snoek
Fabian Caba Heilbron
Humam Alwassel
Ranjay Krishna
Victor Escorcia
Kenji Hata
S. Buch
110
48
0
22 Oct 2017
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
Wei Ping
Kainan Peng
Andrew Gibiansky
Sercan O. Arik
Ajay Kannan
Sharan Narang
Jonathan Raiman
John Miller
99
309
0
20 Oct 2017
Searching for Activation Functions
Prajit Ramachandran
Barret Zoph
Quoc V. Le
97
612
0
16 Oct 2017
Social Attention: Modeling Attention in Human Crowds
Anirudh Vemula
Katharina Muelling
Jean Oh
HAI
76
647
0
12 Oct 2017
Low-Rank RNN Adaptation for Context-Aware Language Modeling
Aaron Jaech
Mari Ostendorf
72
25
0
06 Oct 2017
Enhanced Neural Machine Translation by Learning from Draft
Aodong Li
Shiyue Zhang
Dong Wang
Thomas Fang Zheng
AIMat
59
5
0
04 Oct 2017
Improving Lexical Choice in Neural Machine Translation
Toan Q. Nguyen
David Chiang
88
86
0
03 Oct 2017
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
Wenpeng Yin
Hinrich Schütze
85
42
0
02 Oct 2017
Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition
L. T. Anh
M. Y. Arkhipov
M. Burtsev
29
37
0
27 Sep 2017
Generating Sentences by Editing Prototypes
Kelvin Guu
Tatsunori B. Hashimoto
Yonatan Oren
Percy Liang
203
316
0
26 Sep 2017
The Consciousness Prior
Yoshua Bengio
DRL
AI4CE
73
231
0
25 Sep 2017
Code Attention: Translating Code to Comments by Exploiting Domain Features
Wenhao Zheng
Hong-Yu Zhou
Ming Li
Jianxin Wu
23
19
0
22 Sep 2017
Neural Networks for Text Correction and Completion in Keyboard Decoding
Shaon Ghosh
Per Ola Kristensson
HAI
49
69
0
19 Sep 2017
Self-Attentive Residual Decoder for Neural Machine Translation
Lesly Miculicich
Nikolaos Pappas
Dhananjay Ram
Andrei Popescu-Belis
52
20
0
14 Sep 2017
DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding
Tao Shen
Tianyi Zhou
Guodong Long
Jing Jiang
Shirui Pan
Chengqi Zhang
119
757
0
14 Sep 2017
Natural Language Inference over Interaction Space
Yichen Gong
Heng Luo
Jian Zhang
115
265
0
13 Sep 2017
Refining Source Representations with Relation Networks for Neural Machine Translation
Wen Zhang
Jiawei Hu
Yang Feng
Qun Liu
41
7
0
12 Sep 2017
Simple Recurrent Units for Highly Parallelizable Recurrence
Tao Lei
Yu Zhang
Sida I. Wang
Hui Dai
Yoav Artzi
LRM
163
277
0
08 Sep 2017