ResearchTrend.AI

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers

21 March 2020
Hongfei Xu
Josef van Genabith
Qiuhui Liu
Deyi Xiong

Papers citing "Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers"

9 / 9 papers shown
Title
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita
Rico Sennrich
Ivan Titov
247
185
0
03 Sep 2019
Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel
Yao-Hung Hubert Tsai
Shaojie Bai
M. Yamada
Louis-Philippe Morency
Ruslan Salakhutdinov
91
251
0
30 Aug 2019
Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
Chenze Shao
Yang Feng
Jinchao Zhang
Fandong Meng
Xilin Chen
Jie Zhou
46
42
0
22 Jun 2019
Assessing the Ability of Self-Attention Networks to Learn Word Order
Baosong Yang
Longyue Wang
Derek F. Wong
Lidia S. Chao
Zhaopeng Tu
25
32
0
03 Jun 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
76
1,120
0
23 May 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
100
1,458
0
15 May 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
272
888
0
03 May 2018
Accelerating Neural Transformer via an Average Attention Network
Biao Zhang
Deyi Xiong
Jinsong Su
45
120
0
02 May 2018
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
151
7,683
0
31 Aug 2015