ResearchTrend.AI
Attention Is All You Need

12 June 2017
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
    3DV

Papers citing "Attention Is All You Need"

50 / 27,337 papers shown
Seeking an Optimal Approach for Computer-Aided Pulmonary Embolism Detection
N. Islam
S. Gehlot
Zongwei Zhou
Michael B. Gotway
Jianming Liang
OOD
156
11
0
15 Sep 2021
A Three Step Training Approach with Data Augmentation for Morphological Inflection
Gábor Szolnok
Botond Barta
Dorina Lakatos
Judit Ács
44
1
0
14 Sep 2021
Explainable Identification of Dementia from Transcripts using Transformer Networks
Loukas Ilias
D. Askounis
77
39
0
14 Sep 2021
Automatically Exposing Problems with Neural Dialog Models
Dian Yu
Kenji Sagae
110
9
0
14 Sep 2021
The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders
Han He
Jinho Choi
113
96
0
14 Sep 2021
On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning
Marc Tanti
Lonneke van der Plas
Claudia Borg
Albert Gatt
80
11
0
14 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
60
75
0
14 Sep 2021
LM-Critic: Language Models for Unsupervised Grammatical Error Correction
Michihiro Yasunaga
J. Leskovec
Percy Liang
84
50
0
14 Sep 2021
Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus
Goonmeet Bajaj
Vinh Phu Nguyen
Thilini Wijesiriwardene
H. Y. Yip
Vishesh Javangula
Srinivasan Parthasarathy
A. Sheth
O. Bodenreider
22
1
0
14 Sep 2021
A Temporal Variational Model for Story Generation
David Wilmot
Frank Keller
DRL
110
9
0
14 Sep 2021
Controllable Dialogue Generation with Disentangled Multi-grained Style Specification and Attribute Consistency Reward
Zhe Hu
Zhiwei Cao
Hou Pong Chan
Jiachen Liu
Xinyan Xiao
Jinsong Su
Hua Wu
72
11
0
14 Sep 2021
Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition
Chuan-Fei Zhang
Yang Liu
Tianren Zhang
Songlu Chen
Feng Chen
Xu-Cheng Yin
51
8
0
14 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
86
102
0
14 Sep 2021
Expert Knowledge-Guided Length-Variant Hierarchical Label Generation for Proposal Classification
Meng Xiao
Ziyue Qiao
Yanjie Fu
Yi Du
Pengyang Wang
VLM
89
9
0
14 Sep 2021
An MRC Framework for Semantic Role Labeling
Nan Wang
Jiwei Li
Yuxian Meng
Xiaofei Sun
Han Qiu
Ziyao Wang
Guoyin Wang
Jun He
58
7
0
14 Sep 2021
Dynamic Attentive Graph Learning for Image Restoration
Chong Mou
Jian Zhang
Zhuoyuan Wu
DiffM
115
83
0
14 Sep 2021
Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation
Xin Zheng
Zhirui Zhang
Shujian Huang
Boxing Chen
Jun Xie
Weihua Luo
Jiajun Chen
121
25
0
14 Sep 2021
Sum-Product-Attention Networks: Leveraging Self-Attention in Probabilistic Circuits
Zhongjie Yu
Devendra Singh Dhami
Kristian Kersting
TPM 3DPC LRM
30
0
0
14 Sep 2021
Continuous Homeostatic Reinforcement Learning for Self-Regulated Autonomous Agents
Hugo Laurençon
Charbel-Raphaël Ségerie
J. Lussange
Boris Gutkin
72
7
0
14 Sep 2021
Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer
Fushun Zhu
Shan Zhao
Peng Wang
Hao Wang
Hua Yan
Shuaicheng Liu
ViT
61
16
0
14 Sep 2021
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
Jongyoon Song
Sungwon Kim
Sungroh Yoon
114
40
0
14 Sep 2021
Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification
Qi Shi
Yu Zhang
Qingyu Yin
Ting Liu
117
19
0
14 Sep 2021
Identifying Untrustworthy Samples: Data Filtering for Open-domain Dialogues with Bayesian Optimization
Lei Shen
Haolan Zhan
Xin Shen
Hongshen Chen
Xiaofang Zhao
Xiao-Dan Zhu
83
17
0
14 Sep 2021
Tesla-Rapture: A Lightweight Gesture Recognition System from mmWave Radar Point Clouds
Dariush Salami
Ramin Hasibi
Sameera Palipana
P. Popovski
T. Michoel
S. Sigg
25
49
0
14 Sep 2021
Multi-modal Motion Prediction with Transformer-based Neural Network for Autonomous Driving
Zhiyu Huang
Xiaoyu Mo
Chen Lv
161
118
0
14 Sep 2021
Structure-Enhanced Pop Music Generation via Harmony-Aware Learning
Xueyao Zhang
Jinchao Zhang
Yao Qiu
Li Wang
Jie Zhou
93
26
0
14 Sep 2021
Exploring the Long Short-Term Dependencies to Infer Shot Influence in Badminton Matches
Wei-Yao Wang
Teng-Fong Chan
Hui-Kuo Yang
Chih-Chuan Wang
Yao-Chung Fan
Wen-Chih Peng
63
20
0
14 Sep 2021
MMCoVaR: Multimodal COVID-19 Vaccine Focused Data Repository for Fake News Detection and a Baseline Architecture for Classification
Mingxuan Chen
Xinqiao Chu
K. P. Subbalakshmi
94
29
0
14 Sep 2021
Progressively Guide to Attend: An Iterative Alignment Framework for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
87
46
0
14 Sep 2021
Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
85
55
0
14 Sep 2021
Evaluating Transferability of BERT Models on Uralic Languages
Judit Ács
Dániel Lévai
András Kornai
63
6
0
13 Sep 2021
Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation
Michalis Korakakis
Andreas Vlachos
CLL
58
2
0
13 Sep 2021
MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks
Cristian-Paul Bara
Sky CH-Wang
J. Chai
120
64
0
13 Sep 2021
Post-OCR Document Correction with large Ensembles of Character Sequence-to-Sequence Models
Juan Ramirez-Orta
Eduardo Xamena
Ana Gabriela Maguitman
E. Milios
Axel J. Soto
3DV
50
15
0
13 Sep 2021
Evaluating Multiway Multilingual NMT in the Turkic Languages
Jamshidbek Mirzakhalov
A. Babu
Aigiz Kunafin
Ahsan Wahab
Behzodbek Moydinboyev
...
Julia Kreutzer
Francis M. Tyers
Orhan Firat
John Licato
Sriram Chellappan
ELM
61
9
0
13 Sep 2021
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
Tongkun Xu
Weihua Chen
Pichao Wang
Fan Wang
Hao Li
Rong Jin
ViT
163
221
0
13 Sep 2021
SPARQLing Database Queries from Intermediate Question Decompositions
Irina Saparina
A. Osokin
76
14
0
13 Sep 2021
Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization
Zachary Susskind
Bryce Arden
L. John
Patrick A Stockton
E. John
NAI
52
41
0
13 Sep 2021
Beyond Isolated Utterances: Conversational Emotion Recognition
R. Pappagari
Piotr Żelasko
Jesús Villalba
Laureano Moro-Velazquez
Najim Dehak
52
4
0
13 Sep 2021
On Pursuit of Designing Multi-modal Transformer for Video Grounding
Meng Cao
Long Chen
Mike Zheng Shou
Can Zhang
Yuexian Zou
83
81
0
13 Sep 2021
xGQA: Cross-Lingual Visual Question Answering
Jonas Pfeiffer
Gregor Geigle
Aishwarya Kamath
Jan-Martin O. Steitz
Stefan Roth
Ivan Vulić
Iryna Gurevych
115
62
0
13 Sep 2021
Packed Levitated Marker for Entity and Relation Extraction
Deming Ye
Yankai Lin
Peng Li
Maosong Sun
210
112
0
13 Sep 2021
Traffic Event Detection as a Slot Filling Problem
Xiangyu Yang
Giannis Bekoulis
Nikos Deligiannis
100
6
0
13 Sep 2021
Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations
Mohsen Fayyaz
Ehsan Aghazadeh
Ali Modarressi
Hosein Mohebbi
Mohammad Taher Pilehvar
35
21
0
13 Sep 2021
r-GAT: Relational Graph Attention Network for Multi-Relational Graphs
Meiqi Chen
Yuan Zhang
Xiaoyu Kou
Yuntao Li
Yan Zhang
GNN
53
16
0
13 Sep 2021
Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework
Abhilash Nandy
Soumya Sharma
Shubham Maddhashiya
K. Sachdeva
Pawan Goyal
Niloy Ganguly
68
19
0
13 Sep 2021
Process Discovery Using Graph Neural Networks
Dominique Sommers
Vlado Menkovski
Dirk Fahland
GNN
24
15
0
13 Sep 2021
Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet
Xingwei He
Victor O.K. Li
BDL
282
24
0
13 Sep 2021
CEM: Commonsense-aware Empathetic Response Generation
Sahand Sabour
Chujie Zheng
Minlie Huang
92
152
0
13 Sep 2021
Region Invariant Normalizing Flows for Mobility Transfer
Vinayak Gupta
Srikanta J. Bedathur
94
12
0
13 Sep 2021