1706.03762
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing "Attention Is All You Need"
50 / 18,593 papers shown
Title
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
30
28
0
30 Jun 2019
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Jie Cao
Michael J. Tanana
Zac E. Imel
E. Poitras
David C. Atkins
Vivek Srikumar
OffRL
31
55
0
30 Jun 2019
Deep Gamblers: Learning to Abstain with Portfolio Theory
Liu Ziyin
Zhikang T. Wang
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
Masahito Ueda
26
111
0
29 Jun 2019
Frame attention networks for facial expression recognition in videos
Debin Meng
Xiaojiang Peng
Kai Wang
Yu Qiao
CVBM
19
169
0
29 Jun 2019
Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
Marcely Zanon Boito
Aline Villavicencio
Laurent Besacier
17
8
0
29 Jun 2019
Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Zhu Zhang
Zhou Zhao
Zhijie Lin
Jingkuan Song
Xiaofei He
BDL
27
14
0
28 Jun 2019
Lost in Translation: Loss and Decay of Linguistic Richness in Machine Translation
Eva Vanmassenhove
D. Shterionov
Andy Way
22
91
0
28 Jun 2019
Relating Simple Sentence Representations in Deep Neural Networks and the Brain
Sharmistha Jat
Hao Tang
Partha P. Talukdar
Tom Michael Mitchell
22
21
0
27 Jun 2019
The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation
Mai Oudah
Amjad Almahairi
Nizar Habash
30
34
0
27 Jun 2019
Selection via Proxy: Efficient Data Selection for Deep Learning
Cody Coleman
Christopher Yeh
Stephen Mussmann
Baharan Mirzasoleiman
Peter Bailis
Percy Liang
J. Leskovec
Matei A. Zaharia
33
330
0
26 Jun 2019
Sharing Attention Weights for Fast Transformer
Tong Xiao
Yinqiao Li
Jingbo Zhu
Zhengtao Yu
Tongran Liu
17
50
0
26 Jun 2019
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs
Soheil Kolouri
Aniruddha Saha
Hamed Pirsiavash
Heiko Hoffmann
AAML
30
231
0
26 Jun 2019
Program Synthesis and Semantic Parsing with Learned Code Idioms
Richard Shin
Miltiadis Allamanis
Marc Brockschmidt
Oleksandr Polozov
24
87
0
26 Jun 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
36
797
0
25 Jun 2019
Saliency-driven Word Alignment Interpretation for Neural Machine Translation
Shuoyang Ding
Hainan Xu
Philipp Koehn
33
55
0
25 Jun 2019
Training an Interactive Helper
Mark P. Woodward
Chelsea Finn
Karol Hausman
27
1
0
24 Jun 2019
Self Multi-Head Attention for Speaker Recognition
Miquel India
Pooyan Safari
Javier Hernando
19
110
0
24 Jun 2019
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
24
3
0
24 Jun 2019
Classification and Clustering of Arguments with Contextualized Word Embeddings
Nils Reimers
Benjamin Schiller
Tilman Beck
Johannes Daxenberger
Christian Stab
Iryna Gurevych
19
165
0
24 Jun 2019
Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study
Jeremy Barnes
Roman Klinger
27
15
0
24 Jun 2019
EQuANt (Enhanced Question Answer Network)
François-Xavier Aubet
D. Danks
Yuchen Zhu
21
3
0
24 Jun 2019
Evaluating the Supervised and Zero-shot Performance of Multi-lingual Translation Models
Chris Hokamp
John Glover
D. Ghalandari
21
14
0
24 Jun 2019
Sequence Generation: From Both Sides to the Middle
Long Zhou
Jiajun Zhang
Chengqing Zong
Heng Yu
36
22
0
23 Jun 2019
Learning Belief Representations for Imitation Learning in POMDPs
Tanmay Gangwani
Joel Lehman
Qiang Liu
Jian Peng
24
36
0
22 Jun 2019
Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
Chenze Shao
Yang Feng
Jinchao Zhang
Fandong Meng
Xilin Chen
Jie Zhou
27
42
0
22 Jun 2019
Graph Star Net for Generalized Multi-Task Learning
H. Lu
Seth H. Huang
Tian Ye
Xiuyan Guo
GNN
33
46
0
21 Jun 2019
Improving Zero-shot Translation with Language-Independent Constraints
Ngoc-Quan Pham
Jan Niehues
Thanh-Le Ha
A. Waibel
31
60
0
20 Jun 2019
Evaluating Protein Transfer Learning with TAPE
Roshan Rao
Nicholas Bhattacharya
Neil Thomas
Yan Duan
Xi Chen
John F. Canny
Pieter Abbeel
Yun S. Song
SSL
56
782
0
19 Jun 2019
Fine-tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction
Christoph Alt
Marc Hübner
Leonhard Hennig
15
119
0
19 Jun 2019
Distilling Translations with Visual Awareness
Julia Ive
Pranava Madhyastha
Lucia Specia
VLM
30
76
0
18 Jun 2019
Improving Sentiment Analysis with Multi-task Learning of Negation
Jeremy Barnes
Erik Velldal
Lilja Øvrelid
16
36
0
18 Jun 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
Wei Fang
Yu-An Chung
James R. Glass
18
27
0
17 Jun 2019
Generalizing Back-Translation in Neural Machine Translation
Miguel Graça
Yunsu Kim
Julian Schamper
Shahram Khadivi
Hermann Ney
17
48
0
17 Jun 2019
Benchmarking Neural Machine Translation for Southern African Languages
Laura Martinus
Jade Z. Abbott
11
18
0
17 Jun 2019
ParNet: Position-aware Aggregated Relation Network for Image-Text matching
Yaxian Xia
Lun Huang
Wenmin Wang
Xiao-Yong Wei
Jie Chen
27
1
0
17 Jun 2019
Meta-learning Pseudo-differential Operators with Deep Neural Networks
Jordi Feliu-Fabà
Yuwei Fan
Lexing Ying
22
39
0
16 Jun 2019
Deep Set Prediction Networks
Yan Zhang
Jonathon S. Hare
Adam Prügel-Bennett
22
108
0
15 Jun 2019
Tagged Back-Translation
Isaac Caswell
Ciprian Chelba
David Grangier
24
218
0
15 Jun 2019
Scalable Syntax-Aware Language Models Using Knowledge Distillation
A. Kuncoro
Chris Dyer
Laura Rimell
S. Clark
Phil Blunsom
35
26
0
14 Jun 2019
A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning
Gonçalo M. Correia
André F. T. Martins
16
42
0
14 Jun 2019
Attention-based Modeling for Emotion Detection and Classification in Textual Conversations
Waleed Ragheb
J. Azé
S. Bringay
Maximilien Servajean
24
25
0
14 Jun 2019
Improving Multi-turn Dialogue Modelling with Utterance ReWriter
Hui Su
Xiaoyu Shen
Rongzhi Zhang
Fei Sun
Pengwei Hu
Cheng Niu
Jie Zhou
KELM
15
108
0
14 Jun 2019
Image Captioning: Transforming Objects into Words
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
45
462
0
14 Jun 2019
Learning Video Representations using Contrastive Bidirectional Transformer
Chen Sun
Fabien Baradel
Kevin Patrick Murphy
Cordelia Schmid
SSL
ViT
27
133
0
13 Jun 2019
2D Attentional Irregular Scene Text Recognizer
Pengyuan Lyu
Zhicheng Yang
Xinhang Leng
Xiaojun Wu
Ruiyu Li
Xiaoyong Shen
3DV
36
50
0
13 Jun 2019
Lattice Transformer for Speech Translation
Pei Zhang
Boxing Chen
Niyu Ge
Kai Fan
37
48
0
13 Jun 2019
Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets
Yifan Peng
Shankai Yan
Zhiyong Lu
LM&MA
AI4MH
15
830
0
13 Jun 2019
A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics
Aditya Joshi
Sarvnaz Karimi
R. Sparks
Cécile Paris
C. Macintyre
22
14
0
13 Jun 2019
Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training
William Harvey
Michael Teng
Frank Wood
31
4
0
13 Jun 2019
Neural Arabic Question Answering
Hussein Mozannar
Karl El Hajal
Elie Maamary
Hazem M. Hajj
18
134
0
12 Jun 2019