ResearchTrend.AI

Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 19,208 papers shown
TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture
Gorker Alp Malazgirt
O. Unsal
A. Cristal
19
5
0
14 May 2019
Entity-Relation Extraction as Multi-Turn Question Answering
Xiaoya Li
Fan Yin
Zijun Sun
Xiayu Li
Arianna Yuan
Duo Chai
Mingxin Zhou
Jiwei Li
35
346
0
14 May 2019
Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies
Yunsu Kim
Yingbo Gao
Hermann Ney
VLM
24
88
0
14 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
44
101
0
13 May 2019
Synchronous Bidirectional Neural Machine Translation
Long Zhou
Jiajun Zhang
Chengqing Zong
22
106
0
13 May 2019
CoLight: Learning Network-level Cooperation for Traffic Signal Control
Hua Wei
Nan Xu
Huichu Zhang
Guanjie Zheng
Xinshi Zang
Chacha Chen
Weinan Zhang
Yanmin Zhu
Kai Xu
Z. Li
44
354
0
11 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
31
171
0
10 May 2019
A logical-based corpus for cross-lingual evaluation
Felipe Salvatore
Marcelo Finger
R. Hirata
21
1
0
10 May 2019
FastDraw: Addressing the Long Tail of Lane Detection by Adapting a Sequential Prediction Network
Jonah Philion
17
159
0
10 May 2019
Deep Unsupervised Cardinality Estimation
Zongheng Yang
Eric Liang
Amog Kamsetty
Chenggang Wu
Yan Duan
Peter Chen
Pieter Abbeel
J. M. Hellerstein
S. Krishnan
Ion Stoica
29
203
0
10 May 2019
Language Modeling with Deep Transformers
Kazuki Irie
Albert Zeyer
Ralf Schluter
Hermann Ney
KELM
46
171
0
10 May 2019
Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition
Kaidi Wang
Xiaojiang Peng
Jianfei Yang
Debin Meng
Yu Qiao
CVBM
29
601
0
10 May 2019
Prototype Propagation Networks (PPN) for Weakly-supervised Few-shot Learning on Category Graph
Lu Liu
Dinesh Manocha
Guodong Long
Jing Jiang
Lina Yao
Chengqi Zhang
26
71
0
10 May 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu-Chiang Frank Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
80
1,551
0
08 May 2019
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation
Christoph Luscher
Eugen Beck
Kazuki Irie
M. Kitza
Wilfried Michel
Albert Zeyer
Ralf Schluter
Hermann Ney
VLM
13
234
0
08 May 2019
Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations
Meishan Zhang
Zhenghua Li
Guohong Fu
Min Zhang
27
55
0
08 May 2019
Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS
Yangzhou Jiang
Cong Zhao
Zeyang Dou
Lei Pang
14
5
0
07 May 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
27
20
0
07 May 2019
Investigating the Successes and Failures of BERT for Passage Re-Ranking
Harshith Padigela
Hamed Zamani
W. Bruce Croft
27
47
0
05 May 2019
Drug-Drug Adverse Effect Prediction with Graph Co-Attention
Andreea Deac
Yu-Hsiang Huang
Petar Velickovic
Pietro Lio
Jian Tang
27
77
0
02 May 2019
Similarity of Neural Network Representations Revisited
Simon Kornblith
Mohammad Norouzi
Honglak Lee
Geoffrey E. Hinton
88
1,362
0
01 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Yue Liu
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
34
587
0
30 Apr 2019
Very Deep Self-Attention Networks for End-to-End Speech Recognition
Ngoc-Quan Pham
T. Nguyen
Jan Niehues
Markus Müller
Sebastian Stüker
A. Waibel
28
161
0
30 Apr 2019
Segmentation is All You Need
Zehua Cheng
Yuxiang Wu
Zhenghua Xu
Thomas Lukasiewicz
Weiyan Wang
33
20
0
30 Apr 2019
Performing Structured Improvisations with pre-trained Deep Learning Models
Pablo Samuel Castro
BDL
27
10
0
30 Apr 2019
Incorporating Symbolic Sequential Modeling for Speech Enhancement
Chien-Feng Liao
Yu Tsao
Xugang Lu
Hisashi Kawai
27
18
0
30 Apr 2019
A self-attention based deep learning method for lesion attribute detection from CT reports
Yifan Peng
Ke Yan
V. Sandfort
Ronald M. Summers
Zhiyong Lu
MedIm
27
18
0
30 Apr 2019
Towards Coherent and Engaging Spoken Dialog Response Generation Using Automatic Conversation Evaluators
Sanghyun Yi
Rahul Goel
Chandra Khatri
Alessandra Cervone
Tagyoung Chung
Behnam Hedayatnia
Anu Venkatesh
Raefer Gabriel
Dilek Z. Hakkani-Tür
20
60
0
30 Apr 2019
Graph Matching Networks for Learning the Similarity of Graph Structured Objects
Yujia Li
Chenjie Gu
T. Dullien
Oriol Vinyals
Pushmeet Kohli
66
517
0
29 Apr 2019
Learning Meta Model for Zero- and Few-shot Face Anti-spoofing
Yunxiao Qin
Chenxu Zhao
Xiangyu Zhu
Zezheng Wang
Zitong Yu
Tianyu Fu
Feng Zhou
Jingping Shi
Zhen Lei
CVBM
24
115
0
29 Apr 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
31
227
0
25 Apr 2019
Local Relation Networks for Image Recognition
Han Hu
Zheng Zhang
Zhenda Xie
Stephen Lin
FAtt
32
498
0
25 Apr 2019
Spatial-Temporal Relation Networks for Multi-Object Tracking
Jiarui Xu
Yue Cao
Zheng Zhang
Han Hu
VOT
68
238
0
25 Apr 2019
HAR-Net: Joint Learning of Hybrid Attention for Single-stage Object Detection
Yali Li
Shengjin Wang
22
33
0
25 Apr 2019
Declarative Recursive Computation on an RDBMS, or, Why You Should Use a Database For Distributed Machine Learning
Dimitrije Jankov
Shangyu Luo
Binhang Yuan
Zhuhua Cai
Jia Zou
C. Jermaine
Zekai J. Gao
26
60
0
25 Apr 2019
How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning
Xinlei Pan
Weiyao Wang
Xiaoshuai Zhang
Yue Liu
Jinfeng Yi
D. Song
MIACV
77
26
0
24 Apr 2019
The Scientific Method in the Science of Machine Learning
Jessica Zosa Forde
Michela Paganini
24
35
0
24 Apr 2019
Computer-aided diagnosis in histopathological images of the endometrium using a convolutional neural network and attention mechanisms
Hao Sun
Xianxu Zeng
Tao Xu
G. Peng
Yutao Ma
37
92
0
24 Apr 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
36
1,854
0
23 Apr 2019
Exploring Structure-Adaptive Graph Learning for Robust Semi-Supervised Classification
Xiang Gao
Wei Hu
Zongming Guo
GNN
20
1
0
23 Apr 2019
Attention Augmented Convolutional Networks
Irwan Bello
Barret Zoph
Ashish Vaswani
Jonathon Shlens
Quoc V. Le
46
999
0
22 Apr 2019
Good-Enough Compositional Data Augmentation
Jacob Andreas
30
230
0
21 Apr 2019
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
FedML
13
181
0
20 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
20
121
0
20 Apr 2019
Mask-Predict: Parallel Decoding of Conditional Masked Language Models
Marjan Ghazvininejad
Omer Levy
Yinhan Liu
Luke Zettlemoyer
MoE
27
35
0
19 Apr 2019
Context-Aware Zero-Shot Recognition
Ruotian Luo
Ning Zhang
Bohyung Han
L. Yang
25
31
0
19 Apr 2019
ERNIE: Enhanced Representation through Knowledge Integration
Yu Sun
Shuohuan Wang
Yukun Li
Shikun Feng
Xuyi Chen
Han Zhang
Xin Tian
Danxiang Zhu
Hao Tian
Hua Wu
79
895
0
19 Apr 2019
Code-Switching for Enhancing NMT with Pre-Specified Translation
Kai Song
Yue Zhang
Heng Yu
Weihua Luo
Kun Wang
Min Zhang
35
116
0
19 Apr 2019
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Shijie Wu
Mark Dredze
VLM
SSeg
44
670
0
19 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
27
77
0
18 Apr 2019