ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 26,904 papers shown
Title
Few-Shot Learning with Graph Neural Networks
Victor Garcia Satorras
Joan Bruna
GNN
182
1,241
0
10 Nov 2017
Attend and Diagnose: Clinical Time Series Analysis using Attention Models
Huan-Zhi Song
Deepta Rajan
Jayaraman J. Thiagarajan
A. Spanias
MLAU
96
456
0
10 Nov 2017
Non-Autoregressive Neural Machine Translation
Jiatao Gu
James Bradbury
Caiming Xiong
Victor O.K. Li
R. Socher
107
798
0
07 Nov 2017
Weighted Transformer Network for Machine Translation
Karim Ahmed
N. Keskar
R. Socher
84
134
0
06 Nov 2017
Attentional Pooling for Action Recognition
Rohit Girdhar
Deva Ramanan
135
321
0
04 Nov 2017
Fixing a Broken ELBO
Alexander A. Alemi
Ben Poole
Ian S. Fischer
Joshua V. Dillon
Rif A. Saurous
Kevin Patrick Murphy
DRL BDL
101
80
0
01 Nov 2017
Paraphrase Generation with Deep Reinforcement Learning
Zichao Li
Xin Jiang
Lifeng Shang
Hang Li
OffRL
119
214
0
01 Nov 2017
DCN+: Mixed Objective and Deep Residual Coattention for Question Answering
Caiming Xiong
Victor Zhong
R. Socher
96
109
0
31 Oct 2017
Graph Attention Networks
Petar Velickovic
Guillem Cucurull
Arantxa Casanova
Adriana Romero
Pietro Lio
Yoshua Bengio
GNN
495
20,343
0
30 Oct 2017
Phase Conductor on Multi-layered Attentions for Machine Comprehension
R. Liu
Wei Wei
Weiguang Mao
M. Chikina
92
22
0
28 Oct 2017
Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
Pat Verga
Emma Strubell
O. Shai
Andrew McCallum
3DV
46
11
0
23 Oct 2017
ActivityNet Challenge 2017 Summary
Guohao Li
Juan Carlos Niebles
Cees G. M. Snoek
Fabian Caba Heilbron
Humam Alwassel
Ranjay Krishna
Victor Escorcia
Kenji Hata
S. Buch
105
48
0
22 Oct 2017
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
Ming-Yu Liu
Kainan Peng
Andrew Gibiansky
Sercan O. Arik
Ajay Kannan
Sharan Narang
Jonathan Raiman
John Miller
90
309
0
20 Oct 2017
Searching for Activation Functions
Prajit Ramachandran
Barret Zoph
Quoc V. Le
97
612
0
16 Oct 2017
Social Attention: Modeling Attention in Human Crowds
Anirudh Vemula
Katharina Muelling
Jean Oh
HAI
73
645
0
12 Oct 2017
Low-Rank RNN Adaptation for Context-Aware Language Modeling
Aaron Jaech
Mari Ostendorf
69
25
0
06 Oct 2017
Enhanced Neural Machine Translation by Learning from Draft
Aodong Li
Shiyue Zhang
Dong Wang
Tianshi Zheng
AIMat
59
5
0
04 Oct 2017
Improving Lexical Choice in Neural Machine Translation
Toan Q. Nguyen
David Chiang
84
86
0
03 Oct 2017
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
Wenpeng Yin
Hinrich Schütze
85
42
0
02 Oct 2017
Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition
L. T. Anh
M. Y. Arkhipov
M. Burtsev
29
37
0
27 Sep 2017
Generating Sentences by Editing Prototypes
Kelvin Guu
Tatsunori B. Hashimoto
Yonatan Oren
Percy Liang
168
316
0
26 Sep 2017
The Consciousness Prior
Yoshua Bengio
DRL AI4CE
70
231
0
25 Sep 2017
Code Attention: Translating Code to Comments by Exploiting Domain Features
Wenhao Zheng
Hong-Yu Zhou
Ming Li
Jianxin Wu
16
19
0
22 Sep 2017
Neural Networks for Text Correction and Completion in Keyboard Decoding
Shaon Ghosh
Per Ola Kristensson
HAI
49
69
0
19 Sep 2017
Self-Attentive Residual Decoder for Neural Machine Translation
Lesly Miculicich
Nikolaos Pappas
Dhananjay Ram
Andrei Popescu-Belis
52
20
0
14 Sep 2017
DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding
Tao Shen
Dinesh Manocha
Guodong Long
Jing Jiang
Shirui Pan
Chengqi Zhang
107
757
0
14 Sep 2017
Natural Language Inference over Interaction Space
Yichen Gong
Heng Luo
Jian Zhang
108
265
0
13 Sep 2017
Refining Source Representations with Relation Networks for Neural Machine Translation
Wen Zhang
Jiawei Hu
Yang Feng
Qun Liu
41
7
0
12 Sep 2017
Simple Recurrent Units for Highly Parallelizable Recurrence
Tao Lei
Yu Zhang
Sida I. Wang
Huijing Dai
Yoav Artzi
LRM
157
277
0
08 Sep 2017
Deep Learning Techniques for Music Generation -- A Survey
Jean-Pierre Briot
Gaëtan Hadjeres
F. Pachet
MGen
150
301
0
05 Sep 2017
Squeeze-and-Excitation Networks
Jie Hu
Li Shen
Samuel Albanie
Gang Sun
Enhua Wu
431
26,653
0
05 Sep 2017
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana
Aditya Koli
Kiran Khatter
Sukhdev Singh
65
1,073
0
17 Aug 2017
Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification
Yunlong Bian
Chuang Gan
Xiao-Chang Liu
Fu Li
Xiang Long
Yandong Li
Heng Qi
Jie Zhou
Shilei Wen
Yuanqing Lin
85
48
0
12 Aug 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
95
461
0
10 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Min Zhang
93
2,844
0
09 Aug 2017
Deep Architectures for Neural Machine Translation
Antonio Valerio Miceli Barone
Jindřich Helcl
Rico Sennrich
Barry Haddow
Alexandra Birch
88
112
0
24 Jul 2017
A Simple Neural Attentive Meta-Learner
Nikhil Mishra
Mostafa Rohaninejad
Xi Chen
Pieter Abbeel
OOD
109
200
0
11 Jul 2017
Dual Supervised Learning
Yingce Xia
Tao Qin
Wei-neng Chen
Jiang Bian
Nenghai Yu
Tie-Yan Liu
SSL
140
143
0
03 Jul 2017
VAIN: Attentional Multi-agent Predictive Modeling
Yedid Hoshen
GNN
103
240
0
19 Jun 2017
One Model To Learn Them All
Lukasz Kaiser
Aidan Gomez
Noam M. Shazeer
Ashish Vaswani
Niki Parmar
Llion Jones
Jakob Uszkoreit
VLM ViT
82
334
0
16 Jun 2017
Depthwise Separable Convolutions for Neural Machine Translation
Lukasz Kaiser
Aidan Gomez
François Chollet
74
279
0
09 Jun 2017
Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs
Jean Maillard
S. Clark
Dani Yogatama
77
89
0
25 May 2017
Recurrent Additive Networks
Kenton Lee
Omer Levy
Luke Zettlemoyer
GNN AI4CE
90
38
0
21 May 2017
Reinforced Mnemonic Reader for Machine Reading Comprehension
Minghao Hu
Yuxing Peng
Zhen Huang
Xipeng Qiu
Furu Wei
Ming Zhou
RALM AIMat
97
69
0
08 May 2017
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Jacob Devlin
73
36
0
04 May 2017
Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
Zhen-Le Yang
Wei Chen
Feng Wang
Bo Xu
GAN AI4CE
89
170
0
15 Mar 2017
Structured Attention Networks
Yoon Kim
Carl Denton
Luong Hoang
Alexander M. Rush
141
463
0
03 Feb 2017
Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey
L. Ferrone
Fabio Massimo Zanzotto
43
38
0
02 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL VLM
307
1,548
0
25 Jan 2017
Boosting Neural Machine Translation
Dakun Zhang
Jungi Kim
Josep Crego
Jean Senellart
AI4CE
73
26
0
19 Dec 2016