Assessing the Ability of Self-Attention Networks to Learn Word Order
arXiv:1906.00592, 3 June 2019
Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu

Papers citing "Assessing the Ability of Self-Attention Networks to Learn Word Order"

12 of 12 citing papers shown.

1. AST-MHSA: Code Summarization using Multi-Head Self-Attention. Y. Nagaraj, U. Gupta. 10 Aug 2023.
2. Position Information in Transformers: An Overview. Philipp Dufter, Martin Schmitt, Hinrich Schütze. 22 Feb 2021.
3. Mitigating the Position Bias of Transformer Models in Passage Re-Ranking. Sebastian Hofstätter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury. 18 Jan 2021.
4. Rethinking the Value of Transformer Components. Wenxuan Wang, Zhaopeng Tu. 07 Nov 2020.
5. On the Sub-Layer Functionalities of Transformer Decoder. Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu. 06 Oct 2020.
6. On the Computational Power of Transformers and its Implications in Sequence Modeling. S. Bhattamishra, Arkil Patel, Navin Goyal. 16 Jun 2020.
7. How Does Selective Mechanism Improve Self-Attention Networks? Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu. 03 May 2020. [AAML]
8. Self-Attention with Cross-Lingual Position Representation. Liang Ding, Longyue Wang, Dacheng Tao. 28 Apr 2020. [MILM]
9. Towards Understanding Neural Machine Translation with Word Importance. Shilin He, Zhaopeng Tu, Xing Wang, Longyue Wang, Michael R. Lyu, Shuming Shi. 01 Sep 2019. [AAML]
10. What you can cram into a single vector: Probing sentence embeddings for linguistic properties. Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni. 03 May 2018.
11. A Decomposable Attention Model for Natural Language Inference. Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit. 06 Jun 2016.
12. Effective Approaches to Attention-based Neural Machine Translation. Thang Luong, Hieu H. Pham, Christopher D. Manning. 17 Aug 2015.