Attention Is All You Need

12 June 2017

Papers citing "Attention Is All You Need"

50 / 18,593 papers shown

Title
Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction Clebeson Canuto dos Santos Plinio Moreno J. L. A. Samatelo R. Vassallo J. Santos-Victor 25 7 0 01 Oct 2019
A Computationally Efficient Pipeline Approach to Full Page Offline Handwritten Text Recognition Jonathan Chung Thomas Delteil 24 31 0 01 Oct 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum Jianyu Wang Vinayak Tantia Nicolas Ballas Michael G. Rabbat 12 200 0 01 Oct 2019
Predicting materials properties without crystal structure: Deep representation learning from stoichiometry Rhys E. A. Goodall A. Lee 21 254 0 01 Oct 2019
Dialogue Transformers Vladimir Vlasov Johannes E. M. Mosig Alan Nichol 27 56 0 01 Oct 2019
Grammatical Error Correction in Low-Resource Scenarios Jakub Náplava Milan Straka 13 55 0 01 Oct 2019
When and Why is Document-level Context Useful in Neural Machine Translation? Yunsu Kim Thanh-Hai Tran Hermann Ney 19 84 0 01 Oct 2019
Multilingual End-to-End Speech Translation Hirofumi Inaguma Kevin Duh Tatsuya Kawahara Shinji Watanabe LRM 28 86 0 01 Oct 2019
Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations Christian Hadiwinoto Hwee Tou Ng Wee Chung Gan 22 83 0 01 Oct 2019
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing Kevin Duarte Yogesh S Rawat M. Shah VOS 16 68 0 30 Sep 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations Po-Yao (Bernie) Huang Xiaojun Chang Alexander G. Hauptmann 30 25 0 30 Sep 2019
Revisiting Self-Training for Neural Sequence Generation Junxian He Jiatao Gu Jiajun Shen Marc'Aurelio Ranzato SSL LRM 244 270 0 30 Sep 2019
A Closer Look at Data Bias in Neural Extractive Summarization Models Ming Zhong Danqing Wang Pengfei Liu Xipeng Qiu Xuanjing Huang 48 42 0 30 Sep 2019
Chameleon: Learning Model Initializations Across Tasks With Different Schemas L. Brinkmeyer Rafael Rêgo Drumond Randolf Scholz Josif Grabocka Lars Schmidt-Thieme CLL 19 8 0 30 Sep 2019
Lane Attention: Predicting Vehicles' Moving Trajectories by Learning Their Attention over Lanes Jiacheng Pan Hongyi Sun Kecheng Xu Yifei Jiang Xiangquan Xiao Jiangtao Hu Jinghao Miao 19 35 0 29 Sep 2019
How to Evaluate Machine Learning Approaches for Combinatorial Optimization: Application to the Travelling Salesman Problem Antoine François Quentin Cappart Louis-Martin Rousseau 22 13 0 28 Sep 2019
Self-Attention Transducers for End-to-End Speech Recognition Zhengkun Tian Jiangyan Yi J. Tao Ye Bai Zhengqi Wen AI4TS 29 70 0 28 Sep 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval Reuben Tan Huijuan Xu Kate Saenko Bryan A. Plummer 28 67 0 27 Sep 2019
On the use of BERT for Neural Machine Translation S. Clinchant K. Jung Vassilina Nikoulina 27 89 0 27 Sep 2019
A Constructive Prediction of the Generalization Error Across Scales Jonathan S. Rosenfeld Amir Rosenfeld Yonatan Belinkov Nir Shavit 36 207 0 27 Sep 2019
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network Heechang Ryu Hayong Shin Jinkyoo Park 25 115 0 27 Sep 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture Ashok Thillaisundaram Theodosia Togia 24 17 0 26 Sep 2019
Monotonic Multihead Attention Xutai Ma J. Pino James Cross Liezl Puzon Jiatao Gu 30 137 0 26 Sep 2019
Set Functions for Time Series Max Horn Michael Moor Christian Bock Bastian Alexander Rieck Karsten M. Borgwardt AI4TS 38 146 0 26 Sep 2019
Towards Understanding the Transferability of Deep Representations Hong Liu Mingsheng Long Jianmin Wang Michael I. Jordan 30 25 0 26 Sep 2019
Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation Ze Yang Can Xu Wei Wu Zhoujun Li 3DV 23 29 0 26 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations Zhenzhong Lan Mingda Chen Sebastian Goodman Kevin Gimpel Piyush Sharma Radu Soricut SSL AIMat 112 6,380 0 26 Sep 2019
A Refined Equilibrium Generative Adversarial Network for Retinal Vessel Segmentation Yukun Zhou Zailiang Chen Hai-lan Shen Xianxian Zheng Rongchang Zhao Xuanchu Duan GAN MedIm 22 48 0 26 Sep 2019
Universal Graph Transformer Self-Attention Networks Dai Quoc Nguyen T. Nguyen Dinh Q. Phung ViT 34 63 0 26 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout Angela Fan Edouard Grave Armand Joulin 43 584 0 25 Sep 2019
Gated Channel Transformation for Visual Recognition Zongxin Yang Linchao Zhu Yu Wu Yezhou Yang ViT 22 203 0 25 Sep 2019
Synthetic Data for Deep Learning Sergey I. Nikolenko 46 348 0 25 Sep 2019
EEG-Based Driver Drowsiness Estimation Using Feature Weighted Episodic Training Yuqi Cui Yifan Xu Dongrui Wu 19 62 0 25 Sep 2019
A Survey of Binary Code Similarity I. Haq Juan Caballero 16 134 0 25 Sep 2019
Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization Ruyi Ji Longyin Wen Libo Zhang Dawei Du Yanjun Wu Chen Zhao Xianglong Liu Feiyue Huang 26 163 0 25 Sep 2019
Tackling Long-Tailed Relations and Uncommon Entities in Knowledge Graph Completion Zihao Wang K. Lai Piji Li Lidong Bing W. Lam 19 32 0 25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models Cheolhyoung Lee Kyunghyun Cho Wanmo Kang MoE 249 208 0 25 Sep 2019
TalkDown: A Corpus for Condescension Detection in Context Zijian Wang Christopher Potts 16 51 0 25 Sep 2019
Improving Noise Robustness In Speaker Identification Using A Two-Stage Attention Model Yanpei Shi Qiang Huang Thomas Hain 30 1 0 24 Sep 2019
Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation Yuhui Yuan Xiaokang Chen Xilin Chen Jingdong Wang ViT 49 1,403 0 24 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA Luowei Zhou Hamid Palangi Lei Zhang Houdong Hu Jason J. Corso Jianfeng Gao MLLM VLM 252 927 0 24 Sep 2019
Talk2Car: Taking Control of Your Self-Driving Car Thierry Deruyttere Simon Vandenhende Dusan Grujicic Luc Van Gool Marie-Francine Moens LM&Ro 31 124 0 24 Sep 2019
6D Pose Estimation with Correlation Fusion Yi Cheng Erik Cambria Ying Sun C. Acar Wei Jing Yan Wu Liyuan Li Cheston Tan Joo-Hwee Lim 45 15 0 24 Sep 2019
Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations Peixiang Zhong Di Wang Chunyan Miao 24 269 0 24 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training Zewen Chi Li Dong Furu Wei Wenhui Wang Xian-Ling Mao Heyan Huang 27 136 0 23 Sep 2019
On Model Stability as a Function of Random Seed Pranava Madhyastha Dhruv Batra 45 62 0 23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings Gregor Wiedemann Steffen Remus Avi Chawla Chris Biemann 27 174 0 23 Sep 2019
Self-attention based end-to-end Hindi-English Neural Machine Translation Siddhant Srivastava Ritu Tiwari 9 2 0 21 Sep 2019
Scale MLPerf-0.6 models on Google TPU-v3 Pods Sameer Kumar Victor Bitorff Dehao Chen Chi-Heng Chou Blake A. Hechtman ... Peter Mattson Shibo Wang Tao Wang Yuanzhong Xu Zongwei Zhou 10 39 0 21 Sep 2019
Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages Yunsu Kim P. Petrov Pavel Petrushkov Shahram Khadivi Hermann Ney LRM 50 81 0 20 Sep 2019