ResearchTrend.AI

Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 18,803 papers shown
Contextualized Non-local Neural Networks for Sequence Learning
Pengfei Liu
Shuaichen Chang
Xuanjing Huang
Jian Tang
Jackie C.K. Cheung
25
47
0
21 Nov 2018
Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor
Hochul Shin
Hyeon Cho
Dongyi Kim
Dae-Kwan Ko
Soo-Chul Lim
Wonjun Hwang
23
18
0
17 Nov 2018
Hierarchical Bipartite Graph Convolution Networks
Marcel Nassar
GNN
17
18
0
17 Nov 2018
Relational Long Short-Term Memory for Video Action Recognition
Zexi Chen
B. Ramachandra
Tianfu Wu
Ranga Raju Vatsavai
24
5
0
16 Nov 2018
Generating Responses Expressing Emotion in an Open-domain Dialogue System
Chenyang Huang
Osmar R. Zaïane
28
3
0
15 Nov 2018
Learning to Predict the Cosmological Structure Formation
Siyu He
Yin Li
Yu Feng
S. Ho
Siamak Ravanbakhsh
Wei Chen
Barnabás Póczós
28
168
0
15 Nov 2018
LinkNet: Relational Embedding for Scene Graph
Sanghyun Woo
Dahun Kim
Donghyeon Cho
In So Kweon
GNN
15
147
0
15 Nov 2018
Selective Feature Connection Mechanism: Concatenating Multi-layer CNN Features with a Feature Selector
Chen Du
Chunheng Wang
Yanna Wang
Cunzhao Shi
Baihua Xiao
22
42
0
15 Nov 2018
Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon
Yoshua Bengio
Andrea Lodi
Antoine Prouvost
89
1,354
0
15 Nov 2018
Translating a Math Word Problem to an Expression Tree
Lei Wang
Yan Wang
Deng Cai
Dongxiang Zhang
Xiaojiang Liu
AIMat
19
157
0
14 Nov 2018
FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs
Guoping Long
Jun Yang
Kai Zhu
Wei Lin
9
9
0
13 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
22
13
0
12 Nov 2018
An Introductory Survey on Attention Mechanisms in NLP Problems
Dichao Hu
AIMat
27
246
0
12 Nov 2018
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Jindrich Libovický
Jindřich Helcl
27
167
0
12 Nov 2018
Input Combination Strategies for Multi-Source Transformer Decoder
Jindrich Libovický
Jindřich Helcl
David Marecek
27
73
0
12 Nov 2018
CUNI System for the WMT18 Multimodal Translation Task
Jindřich Helcl
Jindrich Libovický
Dušan Variš
16
57
0
12 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
36
20
0
12 Nov 2018
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
Raden Mu'az Mun'im
Nakamasa Inoue
Koichi Shinoda
30
25
0
12 Nov 2018
An initial attempt of combining visual selective attention with deep reinforcement learning
Liu Yuezhang
Ruohan Zhang
D. Ballard
20
20
0
11 Nov 2018
Scene Text Detection and Recognition: The Deep Learning Era
Shangbang Long
Xin He
Cong Yao
VLM
44
390
0
10 Nov 2018
Skeleton-Based Action Recognition with Synchronous Local and Non-local Spatio-temporal Learning and Frequency Attention
Guyue Hu
Bo Cui
Shan Yu
19
40
0
10 Nov 2018
Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency
Won Ik Cho
Hyeon Seung Lee
J. Yoon
Seokhwan Kim
N. Kim
39
5
0
10 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
19
111
0
09 Nov 2018
The RLLChatbot: a solution to the ConvAI challenge
Nicolas Angelard-Gontier
Koustuv Sinha
Peter Henderson
Iulian Serban
Michael Noseworthy
Prasanna Parthasarathi
Joelle Pineau
OffRL
33
0
0
07 Nov 2018
Molecular Transformer - A Model for Uncertainty-Calibrated Chemical Reaction Prediction
P. Schwaller
Teodoro Laino
John McGuinness
A. Horváth
Constantine Bekas
A. Lee
36
719
0
06 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis
Younggun Lee
Jonathan Le Roux
11
147
0
06 Nov 2018
End-to-End Monaural Multi-speaker ASR System without Pretraining
Xuankai Chang
Y. Qian
Yi Liang
Deming Chen
27
76
0
05 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
31
159
0
05 Nov 2018
Compact Personalized Models for Neural Machine Translation
Joern Wuebker
A. Paz
N. Ravid
VLM
11
56
0
05 Nov 2018
Structured Neural Summarization
Patrick Fernandes
Miltiadis Allamanis
Marc Brockschmidt
GNN
27
212
0
05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion
Hirokazu Kameoka
Kou Tanaka
Damian Kwaśny
Takuhiro Kaneko
Nobukatsu Hojo
33
62
0
05 Nov 2018
RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans
Qiangguo Jin
Zhao-Peng Meng
Changming Sun
Leyi Wei
R. Su
MedIm
27
354
0
04 Nov 2018
Wizard of Wikipedia: Knowledge-Powered Conversational agents
Emily Dinan
Stephen Roller
Kurt Shuster
Angela Fan
Michael Auli
Jason Weston
RALM
KELM
53
935
0
03 Nov 2018
Identifying and Controlling Important Neurons in Neural Machine Translation
A. Bau
Yonatan Belinkov
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
James R. Glass
MILM
21
180
0
03 Nov 2018
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
Surafel Melaku Lakew
A. Erofeeva
Matteo Negri
Marcello Federico
Marco Turchi
24
62
0
03 Nov 2018
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
Jason Phang
Thibault Févry
Samuel R. Bowman
33
467
0
02 Nov 2018
Neural Machine Translation into Language Varieties
Surafel Melaku Lakew
A. Erofeeva
Marcello Federico
28
49
0
02 Nov 2018
Image Chat: Engaging Grounded Conversations
Kurt Shuster
Samuel Humeau
Antoine Bordes
Jason Weston
23
115
0
02 Nov 2018
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
40
1,623
0
02 Nov 2018
Importance of Search and Evaluation Strategies in Neural Dialogue Modeling
Ilia Kulikov
Alexander H. Miller
Kyunghyun Cho
Jason Weston
32
83
0
02 Nov 2018
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Byeongchang Kim
Hyunwoo J. Kim
Gunhee Kim
23
182
0
02 Nov 2018
Language-Independent Representor for Neural Machine Translation
Long Zhou
Yuchen Liu
Jiajun Zhang
Chengqing Zong
Guoping Huang
33
1
0
01 Nov 2018
Hybrid Self-Attention Network for Machine Translation
Kaitao Song
Tan Xu
Furong Peng
Jianfeng Lu
21
12
0
01 Nov 2018
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
Hannah Rashkin
Eric Michael Smith
Margaret Li
Y-Lan Boureau
VLM
17
48
0
01 Nov 2018
Towards Explainable NLP: A Generative Explanation Framework for Text Classification
Hui Liu
Qingyu Yin
William Yang Wang
27
148
0
01 Nov 2018
Dial2Desc: End-to-end Dialogue Description Generation
Haojie Pan
Junpei Zhou
Zhou Zhao
Yan Liu
Deng Cai
Min Yang
VLM
18
14
0
01 Nov 2018
A Regularized Attention Mechanism for Graph Attention Networks
U. Shanthamallu
Jayaraman J. Thiagarajan
A. Spanias
OOD
GNN
19
17
0
01 Nov 2018
Improving Machine Reading Comprehension with General Reading Strategies
Kai Sun
Dian Yu
Dong Yu
Claire Cardie
AI4CE
24
116
0
31 Oct 2018
You May Not Need Attention
Ofir Press
Noah A. Smith
19
27
0
31 Oct 2018
Hybrid Knowledge Routed Modules for Large-scale Object Detection
Chenhan Jiang
Hang Xu
Xiaodan Liang
Liang Lin
VLM
ObjD
39
86
0
30 Oct 2018