1706.03762
v7 (latest)
Attention Is All You Need
12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
Papers citing
"Attention Is All You Need"
50 / 27,337 papers shown
Title
Language Models as a Knowledge Source for Cognitive Agents
R. Wray
James R. Kirk
John E. Laird
57
15
0
17 Sep 2021
Sparse Factorization of Large Square Matrices
Ruslan Khalitov
Tong Yu
Lei Cheng
Zhirong Yang
26
2
0
16 Sep 2021
An End-to-End Transformer Model for 3D Object Detection
Ishan Misra
Rohit Girdhar
Armand Joulin
3DPC
ViT
138
487
0
16 Sep 2021
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
123
47
0
16 Sep 2021
The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu
Bei Li
Ye Lin
Yinqiao Li
Yanyang Li
Chenglong Wang
Tong Xiao
Jingbo Zhu
33
7
0
16 Sep 2021
Alquist 4.0: Towards Social Intelligence Using Generative Models and Dialogue Personalization
Jakub Konrád
Jan Pichl
Petro Marek
Petr Lorenc
Van Duy Ta
Ondrej Kobza
L. Hýlová
Jan Sedivý
76
17
0
16 Sep 2021
Overview of Tencent Multi-modal Ads Video Understanding Challenge
Zhenzhi Wang
Liyu Wu
Zhimin Li
Jiangfeng Xiong
Qinglin Lu
58
4
0
16 Sep 2021
RetrievalSum: A Retrieval Enhanced Framework for Abstractive Summarization
Chen An
Ming Zhong
Zhichao Geng
Jianqiang Yang
Xipeng Qiu
RALM
74
25
0
16 Sep 2021
MFE-NER: Multi-feature Fusion Embedding for Chinese Named Entity Recognition
Jiatong Li
Kui Meng
47
16
0
16 Sep 2021
Translation Transformers Rediscover Inherent Data Domains
Maksym Del
Elizaveta Korotkova
Mark Fishel
46
7
0
16 Sep 2021
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning
Shikha Dubey
Farrukh Olimov
M. Rafique
Joonmo Kim
M. Jeon
ViT
82
42
0
16 Sep 2021
Improving Neural Machine Translation by Bidirectional Training
Liang Ding
Di Wu
Dacheng Tao
77
30
0
16 Sep 2021
Constructing Emotion Consensus and Utilizing Unpaired Data for Empathetic Dialogue Generation
Lei Shen
Jinchao Zhang
Jiao Ou
Xiaofang Zhao
Jie Zhou
58
6
0
16 Sep 2021
Utterance-level neural confidence measure for end-to-end children speech recognition
W. Liu
Tan Lee
51
5
0
16 Sep 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
90
103
0
16 Sep 2021
Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning
Sumeet Batra
Zhehui Huang
Aleksei Petrenko
T. Kumar
Artem Molchanov
Gaurav Sukhatme
73
47
0
16 Sep 2021
Few-Shot Object Detection by Attending to Per-Sample-Prototype
Hojun Lee
Myunggi Lee
Nojun Kwak
ObjD
98
32
0
16 Sep 2021
Decoupling Long- and Short-Term Patterns in Spatiotemporal Inference
Junfeng Hu
Yuxuan Liang
Zhencheng Fan
Ying Zhang
Yifang Yin
Roger Zimmermann
AI4TS
116
11
0
16 Sep 2021
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication
Runsheng Xu
Hao Xiang
Xin Xia
Xu Han
Jinlong Liu
Jiaqi Ma
113
378
0
16 Sep 2021
Towards Zero-shot Cross-lingual Image Retrieval and Tagging
Pranav Aggarwal
Ritiz Tambi
Ajinkya Kale
VLM
89
6
0
15 Sep 2021
On the Complementarity of Data Selection and Fine Tuning for Domain Adaptation
Dan Iter
David Grangier
92
10
0
15 Sep 2021
Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers
Ángel Martínez-González
M. Villamizar
J. Odobez
ViT
55
75
0
15 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
83
55
0
15 Sep 2021
Short Quantum Circuits in Reinforcement Learning Policies for the Vehicle Routing Problem
F. Sanches
Sean J. Weinberg
Takanori Ide
Kazumitsu Kamiya
85
10
0
15 Sep 2021
On the Limits of Minimal Pairs in Contrastive Evaluation
Jannis Vamvas
Rico Sennrich
88
16
0
15 Sep 2021
Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering
Youngjoon Kwon
Dahun Kim
Duygu Ceylan
Henry Fuchs
3DH
146
170
0
15 Sep 2021
When Does Translation Require Context? A Data-driven, Multilingual Exploration
Patrick Fernandes
Kayo Yin
Emmy Liu
André F. T. Martins
Graham Neubig
54
36
0
15 Sep 2021
CAMul: Calibrated and Accurate Multi-view Time-Series Forecasting
Harshavardhan Kamarthi
Lingkai Kong
Alexander Rodríguez
Chao Zhang
B. Prakash
AI4TS
110
17
0
15 Sep 2021
BERT is Robust! A Case Against Synonym-Based Adversarial Examples in Text Classification
J. Hauser
Zhao Meng
Damian Pascual
Roger Wattenhofer
OOD
SILM
AAML
252
14
0
15 Sep 2021
Multi View Spatial-Temporal Model for Travel Time Estimation
Zichuan Liu
Zhaoyang Wu
Meng Wang
Rui Zhang
38
6
0
15 Sep 2021
Matching with Transformers in MELT
S. Hertling
Jan Portisch
Heiko Paulheim
50
9
0
15 Sep 2021
Learning When to Translate for Streaming Speech
Qianqian Dong
Yaoming Zhu
Mingxuan Wang
Lei Li
100
30
0
15 Sep 2021
The ELITR ECA Corpus
Philip Williams
Barry Haddow
33
4
0
15 Sep 2021
Miðeind's WMT 2021 submission
Haukur Barri Símonarson
Vésteinn Snæbjarnarson
Pétur Orri Ragnarsson
Haukur Páll Jónsson
Vilhjálmur Þorsteinsson
VLM
49
11
0
15 Sep 2021
Prefix-to-SQL: Text-to-SQL Generation from Incomplete User Questions
Naihao Deng
Shuaichen Chang
Peng Shi
Tao Yu
Rui Zhang
LMTD
64
4
0
15 Sep 2021
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
Bo Zheng
Li Dong
Shaohan Huang
Saksham Singhal
Wanxiang Che
Ting Liu
Xia Song
Furu Wei
VLM
82
22
0
15 Sep 2021
Sequence Length is a Domain: Length-based Overfitting in Transformer Models
Dusan Varis
Ondrej Bojar
73
56
0
15 Sep 2021
Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition
Zhengyao Wen
Wen-Long Lin
Tao Wang
Ge Xu
CVBM
188
219
0
15 Sep 2021
Integrating Sensing and Communication in Cellular Networks via NR Sidelink
Dariush Salami
Ramin Hasibi
S. Savazzi
T. Michoel
S. Sigg
55
3
0
15 Sep 2021
Enhancing Clinical Information Extraction with Transferred Contextual Embeddings
Zimin Wan
Chenchen Xu
H. Suominen
47
0
0
15 Sep 2021
Dialog speech sentiment classification for imbalanced datasets
Sergis Nicolaou
Lambros Mavrides
G. Tryfou
Kyriakos Tolias
Konstantinos P. Panousis
S. Chatzis
Sergios Theodoridis
36
0
0
15 Sep 2021
MISSFormer: An Effective Medical Image Segmentation Transformer
Xiaohong Huang
Zhifang Deng
Dandan Li
Xueguang Yuan
ViT
MedIm
163
186
0
15 Sep 2021
Incorporating Residual and Normalization Layers into Analysis of Masked Language Models
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
240
49
0
15 Sep 2021
Beyond Glass-Box Features: Uncertainty Quantification Enhanced Quality Estimation for Neural Machine Translation
Ke Min Wang
Yangbin Shi
Jiayi Wang
Yuqi Zhang
Yu Zhao
Xiaolin Zheng
75
6
0
15 Sep 2021
Anchor DETR: Query Design for Transformer-Based Object Detection
Yingming Wang
Xinming Zhang
Tong Yang
Jian Sun
ViT
65
54
0
15 Sep 2021
Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering
Zhe Lin
Yitao Cai
Xiaojun Wan
74
14
0
15 Sep 2021
Transformer-based Lexically Constrained Headline Generation
Kosuke Yamada
Yuta Hitomi
Hideaki Tamori
Ryohei Sasano
Naoaki Okazaki
Kentaro Inui
Koichi Takeda
78
12
0
15 Sep 2021
Self-Training with Differentiable Teacher
Simiao Zuo
Yue Yu
Chen Liang
Haoming Jiang
Siawpeng Er
Chao Zhang
T. Zhao
H. Zha
89
14
0
15 Sep 2021
Attention Is Indeed All You Need: Semantically Attention-Guided Decoding for Data-to-Text NLG
Juraj Juraska
M. Walker
56
17
0
15 Sep 2021
PnP-DETR: Towards Efficient Visual Analysis with Transformers
Tao Wang
Li Yuan
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
ViT
68
88
0
15 Sep 2021