ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 19,025 papers shown
CCNet: Criss-Cross Attention for Semantic Segmentation
Zilong Huang
Xinggang Wang
Yunchao Wei
Lichao Huang
Humphrey Shi
Wenyu Liu
Chang Huang
VOS
38
2,523
0
28 Nov 2018
Neural Sign Language Translation based on Human Keypoint Estimation
Sang-Ki Ko
Chang Jo Kim
Hyedong Jung
Choongsang Cho
SLR
35
207
0
28 Nov 2018
Sequential Variational Autoencoders for Collaborative Filtering
Noveen Sachdeva
Giuseppe Manco
Ettore Ritacco
Vikram Pudi
BDL
18
102
0
25 Nov 2018
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications
Jongsoo Park
Maxim Naumov
Protonu Basu
Summer Deng
Aravind Kalaiah
...
Lin Qiao
Vijay Rao
Nadav Rotem
S. Yoo
M. Smelyanskiy
FedML
GNN
BDL
20
187
0
24 Nov 2018
Latent Dirichlet Allocation with Residual Convolutional Neural Network Applied in Evaluating Credibility of Chinese Listed Companies
Mohan Zhang
Z. Luo
Hai Lu
35
1
0
24 Nov 2018
Connecting the Dots Between MLE and RL for Sequence Prediction
Bowen Tan
Zhiting Hu
Zichao Yang
Ruslan Salakhutdinov
Eric Xing
28
24
0
24 Nov 2018
Contextualized Non-local Neural Networks for Sequence Learning
Pengfei Liu
Shuaichen Chang
Xuanjing Huang
Jian Tang
Jackie C.K. Cheung
25
47
0
21 Nov 2018
Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor
Hochul Shin
Hyeon Cho
Dongyi Kim
Dae-Kwan Ko
Soo-Chul Lim
Wonjun Hwang
23
18
0
17 Nov 2018
Hierarchical Bipartite Graph Convolution Networks
Marcel Nassar
GNN
17
18
0
17 Nov 2018
Relational Long Short-Term Memory for Video Action Recognition
Zexi Chen
B. Ramachandra
Tianfu Wu
Ranga Raju Vatsavai
24
5
0
16 Nov 2018
Generating Responses Expressing Emotion in an Open-domain Dialogue System
Chenyang Huang
Osmar R. Zaïane
28
3
0
15 Nov 2018
Learning to Predict the Cosmological Structure Formation
Siyu He
Yin Li
Yu Feng
S. Ho
Siamak Ravanbakhsh
Wei Chen
Barnabás Póczós
28
168
0
15 Nov 2018
LinkNet: Relational Embedding for Scene Graph
Sanghyun Woo
Dahun Kim
Donghyeon Cho
In So Kweon
GNN
15
147
0
15 Nov 2018
Selective Feature Connection Mechanism: Concatenating Multi-layer CNN Features with a Feature Selector
Chen Du
Chunheng Wang
Yanna Wang
Cunzhao Shi
Baihua Xiao
27
42
0
15 Nov 2018
Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon
Yoshua Bengio
Andrea Lodi
Antoine Prouvost
89
1,356
0
15 Nov 2018
Translating a Math Word Problem to an Expression Tree
Lei Wang
Yan Wang
Deng Cai
Dongxiang Zhang
Xiaojiang Liu
AIMat
19
157
0
14 Nov 2018
FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs
Guoping Long
Jun Yang
Kai Zhu
Wei Lin
19
9
0
13 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
22
13
0
12 Nov 2018
An Introductory Survey on Attention Mechanisms in NLP Problems
Dichao Hu
AIMat
27
246
0
12 Nov 2018
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Jindřich Libovický
Jindřich Helcl
27
167
0
12 Nov 2018
Input Combination Strategies for Multi-Source Transformer Decoder
Jindřich Libovický
Jindřich Helcl
David Marecek
32
73
0
12 Nov 2018
CUNI System for the WMT18 Multimodal Translation Task
Jindřich Helcl
Jindřich Libovický
Dušan Variš
16
57
0
12 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
36
20
0
12 Nov 2018
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
Raden Mu'az Mun'im
Nakamasa Inoue
Koichi Shinoda
30
25
0
12 Nov 2018
An initial attempt of combining visual selective attention with deep reinforcement learning
Liu Yuezhang
Ruohan Zhang
D. Ballard
32
20
0
11 Nov 2018
Scene Text Detection and Recognition: The Deep Learning Era
Shangbang Long
Xin He
Cong Yao
VLM
49
390
0
10 Nov 2018
Skeleton-Based Action Recognition with Synchronous Local and Non-local Spatio-temporal Learning and Frequency Attention
Guyue Hu
Bo Cui
Shan Yu
21
40
0
10 Nov 2018
Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency
Won Ik Cho
Hyeon Seung Lee
J. Yoon
Seokhwan Kim
N. Kim
44
5
0
10 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
19
111
0
09 Nov 2018
The RLLChatbot: a solution to the ConvAI challenge
Nicolas Angelard-Gontier
Koustuv Sinha
Peter Henderson
Iulian Serban
Michael Noseworthy
Prasanna Parthasarathi
Joelle Pineau
OffRL
33
0
0
07 Nov 2018
Molecular Transformer - A Model for Uncertainty-Calibrated Chemical Reaction Prediction
P. Schwaller
Teodoro Laino
John McGuinness
A. Horváth
Constantine Bekas
A. Lee
41
721
0
06 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis
Younggun Lee
Jonathan Le Roux
11
147
0
06 Nov 2018
End-to-End Monaural Multi-speaker ASR System without Pretraining
Xuankai Chang
Y. Qian
Yi Liang
Deming Chen
27
76
0
05 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
31
159
0
05 Nov 2018
Compact Personalized Models for Neural Machine Translation
Joern Wuebker
A. Paz
N. Ravid
VLM
21
56
0
05 Nov 2018
Structured Neural Summarization
Patrick Fernandes
Miltiadis Allamanis
Marc Brockschmidt
GNN
27
212
0
05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion
Hirokazu Kameoka
Kou Tanaka
Damian Kwaśny
Takuhiro Kaneko
Nobukatsu Hojo
36
62
0
05 Nov 2018
RA-UNet: A hybrid deep attention-aware network to extract liver and tumor in CT scans
Qiangguo Jin
Zhao-Peng Meng
Changming Sun
Leyi Wei
R. Su
MedIm
32
354
0
04 Nov 2018
Wizard of Wikipedia: Knowledge-Powered Conversational agents
Emily Dinan
Stephen Roller
Kurt Shuster
Angela Fan
Michael Auli
Jason Weston
RALM
KELM
53
935
0
03 Nov 2018
Identifying and Controlling Important Neurons in Neural Machine Translation
A. Bau
Yonatan Belinkov
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
James R. Glass
MILM
21
180
0
03 Nov 2018
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
Surafel Melaku Lakew
A. Erofeeva
Matteo Negri
Marcello Federico
Marco Turchi
26
62
0
03 Nov 2018
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
Jason Phang
Thibault Févry
Samuel R. Bowman
33
467
0
02 Nov 2018
Neural Machine Translation into Language Varieties
Surafel Melaku Lakew
A. Erofeeva
Marcello Federico
33
49
0
02 Nov 2018
Image Chat: Engaging Grounded Conversations
Kurt Shuster
Samuel Humeau
Antoine Bordes
Jason Weston
23
115
0
02 Nov 2018
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
75
1,623
0
02 Nov 2018
Importance of Search and Evaluation Strategies in Neural Dialogue Modeling
Ilia Kulikov
Alexander H. Miller
Kyunghyun Cho
Jason Weston
32
83
0
02 Nov 2018
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks
Byeongchang Kim
Hyunwoo J. Kim
Gunhee Kim
31
182
0
02 Nov 2018
Language-Independent Representor for Neural Machine Translation
Long Zhou
Yuchen Liu
Jiajun Zhang
Chengqing Zong
Guoping Huang
33
1
0
01 Nov 2018
Hybrid Self-Attention Network for Machine Translation
Kaitao Song
Tan Xu
Furong Peng
Jianfeng Lu
21
12
0
01 Nov 2018
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
Hannah Rashkin
Eric Michael Smith
Margaret Li
Y-Lan Boureau
VLM
17
48
0
01 Nov 2018