1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 18,593 papers shown
Title
Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction
Clebeson Canuto dos Santos
Plinio Moreno
J. L. A. Samatelo
R. Vassallo
J. Santos-Victor
25
7
0
01 Oct 2019
A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition
Jonathan Chung
Thomas Delteil
24
31
0
01 Oct 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Jianyu Wang
Vinayak Tantia
Nicolas Ballas
Michael G. Rabbat
12
200
0
01 Oct 2019
Predicting materials properties without crystal structure: Deep representation learning from stoichiometry
Rhys E. A. Goodall
A. Lee
21
254
0
01 Oct 2019
Dialogue Transformers
Vladimir Vlasov
Johannes E. M. Mosig
Alan Nichol
27
56
0
01 Oct 2019
Grammatical Error Correction in Low-Resource Scenarios
Jakub Náplava
Milan Straka
13
55
0
01 Oct 2019
When and Why is Document-level Context Useful in Neural Machine Translation?
Yunsu Kim
Thanh-Hai Tran
Hermann Ney
19
84
0
01 Oct 2019
Multilingual End-to-End Speech Translation
Hirofumi Inaguma
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
LRM
28
86
0
01 Oct 2019
Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations
Christian Hadiwinoto
Hwee Tou Ng
Wee Chung Gan
22
83
0
01 Oct 2019
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing
Kevin Duarte
Yogesh S Rawat
M. Shah
VOS
16
68
0
30 Sep 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
Po-Yao (Bernie) Huang
Xiaojun Chang
Alexander G. Hauptmann
30
25
0
30 Sep 2019
Revisiting Self-Training for Neural Sequence Generation
Junxian He
Jiatao Gu
Jiajun Shen
Marc'Aurelio Ranzato
SSL
LRM
244
270
0
30 Sep 2019
A Closer Look at Data Bias in Neural Extractive Summarization Models
Ming Zhong
Danqing Wang
Pengfei Liu
Xipeng Qiu
Xuanjing Huang
48
42
0
30 Sep 2019
Chameleon: Learning Model Initializations Across Tasks With Different Schemas
L. Brinkmeyer
Rafael Rêgo Drumond
Randolf Scholz
Josif Grabocka
Lars Schmidt-Thieme
CLL
19
8
0
30 Sep 2019
Lane Attention: Predicting Vehicles' Moving Trajectories by Learning Their Attention over Lanes
Jiacheng Pan
Hongyi Sun
Kecheng Xu
Yifei Jiang
Xiangquan Xiao
Jiangtao Hu
Jinghao Miao
19
35
0
29 Sep 2019
How to Evaluate Machine Learning Approaches for Combinatorial Optimization: Application to the Travelling Salesman Problem
Antoine François
Quentin Cappart
Louis-Martin Rousseau
22
13
0
28 Sep 2019
Self-Attention Transducers for End-to-End Speech Recognition
Zhengkun Tian
Jiangyan Yi
J. Tao
Ye Bai
Zhengqi Wen
AI4TS
29
70
0
28 Sep 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
Reuben Tan
Huijuan Xu
Kate Saenko
Bryan A. Plummer
28
67
0
27 Sep 2019
On the use of BERT for Neural Machine Translation
S. Clinchant
K. Jung
Vassilina Nikoulina
27
89
0
27 Sep 2019
A Constructive Prediction of the Generalization Error Across Scales
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
36
207
0
27 Sep 2019
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network
Heechang Ryu
Hayong Shin
Jinkyoo Park
25
115
0
27 Sep 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Ashok Thillaisundaram
Theodosia Togia
24
17
0
26 Sep 2019
Monotonic Multihead Attention
Xutai Ma
J. Pino
James Cross
Liezl Puzon
Jiatao Gu
30
137
0
26 Sep 2019
Set Functions for Time Series
Max Horn
Michael Moor
Christian Bock
Bastian Alexander Rieck
Karsten M. Borgwardt
AI4TS
38
146
0
26 Sep 2019
Towards Understanding the Transferability of Deep Representations
Hong Liu
Mingsheng Long
Jianmin Wang
Michael I. Jordan
30
25
0
26 Sep 2019
Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation
Ze Yang
Can Xu
Wei Wu
Zhoujun Li
3DV
23
29
0
26 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
112
6,380
0
26 Sep 2019
A Refined Equilibrium Generative Adversarial Network for Retinal Vessel Segmentation
Yukun Zhou
Zailiang Chen
Hai-lan Shen
Xianxian Zheng
Rongchang Zhao
Xuanchu Duan
GAN
MedIm
22
48
0
26 Sep 2019
Universal Graph Transformer Self-Attention Networks
Dai Quoc Nguyen
T. Nguyen
Dinh Q. Phung
ViT
34
63
0
26 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
43
584
0
25 Sep 2019
Gated Channel Transformation for Visual Recognition
Zongxin Yang
Linchao Zhu
Yu Wu
Yezhou Yang
ViT
22
203
0
25 Sep 2019
Synthetic Data for Deep Learning
Sergey I. Nikolenko
46
348
0
25 Sep 2019
EEG-Based Driver Drowsiness Estimation Using Feature Weighted Episodic Training
Yuqi Cui
Yifan Xu
Dongrui Wu
19
62
0
25 Sep 2019
A Survey of Binary Code Similarity
I. Haq
Juan Caballero
16
134
0
25 Sep 2019
Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization
Ruyi Ji
Longyin Wen
Libo Zhang
Dawei Du
Yanjun Wu
Chen Zhao
Xianglong Liu
Feiyue Huang
26
163
0
25 Sep 2019
Tackling Long-Tailed Relations and Uncommon Entities in Knowledge Graph Completion
Zihao Wang
K. Lai
Piji Li
Lidong Bing
W. Lam
19
32
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
249
208
0
25 Sep 2019
TalkDown: A Corpus for Condescension Detection in Context
Zijian Wang
Christopher Potts
16
51
0
25 Sep 2019
Improving Noise Robustness In Speaker Identification Using A Two-Stage Attention Model
Yanpei Shi
Qiang Huang
Thomas Hain
30
1
0
24 Sep 2019
Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation
Yuhui Yuan
Xiaokang Chen
Xilin Chen
Jingdong Wang
ViT
49
1,403
0
24 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Talk2Car: Taking Control of Your Self-Driving Car
Thierry Deruyttere
Simon Vandenhende
Dusan Grujicic
Luc Van Gool
Marie-Francine Moens
LM&Ro
31
124
0
24 Sep 2019
6D Pose Estimation with Correlation Fusion
Yi Cheng
Erik Cambria
Ying Sun
C. Acar
Wei Jing
Yan Wu
Liyuan Li
Cheston Tan
Joo-Hwee Lim
45
15
0
24 Sep 2019
Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations
Peixiang Zhong
Di Wang
Chunyan Miao
24
269
0
24 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training
Zewen Chi
Li Dong
Furu Wei
Wenhui Wang
Xian-Ling Mao
Heyan Huang
27
136
0
23 Sep 2019
On Model Stability as a Function of Random Seed
Pranava Madhyastha
Dhruv Batra
45
62
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
27
174
0
23 Sep 2019
Self-attention based end-to-end Hindi-English Neural Machine Translation
Siddhant Srivastava
Ritu Tiwari
9
2
0
21 Sep 2019
Scale MLPerf-0.6 models on Google TPU-v3 Pods
Sameer Kumar
Victor Bitorff
Dehao Chen
Chi-Heng Chou
Blake A. Hechtman
...
Peter Mattson
Shibo Wang
Tao Wang
Yuanzhong Xu
Zongwei Zhou
10
39
0
21 Sep 2019
Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages
Yunsu Kim
P. Petrov
Pavel Petrushkov
Shahram Khadivi
Hermann Ney
LRM
50
81
0
20 Sep 2019