arXiv: 1906.01787
Learning Deep Transformer Models for Machine Translation

5 June 2019
Qiang Wang
Bei Li
Tong Xiao
Jingbo Zhu
Changliang Li
Derek F. Wong
Lidia S. Chao

Papers citing "Learning Deep Transformer Models for Machine Translation"

50 / 344 papers shown
Title
Gender Bias Amplification During Speed-Quality Optimization in Neural Machine Translation
Adithya Renduchintala
Denise Díaz
Kenneth Heafield
Xian Li
Mona T. Diab
79
41
0
01 Jun 2021
Fast Nearest Neighbor Machine Translation
Yuxian Meng
Xiaoya Li
Xiayu Zheng
Leilei Gan
Xiaofei Sun
Tianwei Zhang
Jiwei Li
LRM
82
49
0
30 May 2021
Transformer-Based Source-Free Domain Adaptation
Guanglei Yang
Hao Tang
Zhun Zhong
M. Ding
Ling Shao
N. Sebe
Elisa Ricci
ViT
82
42
0
28 May 2021
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
Xiao Pan
Mingxuan Wang
Liwei Wu
Lei Li
97
207
0
20 May 2021
Learning Language Specific Sub-network for Multilingual Machine Translation
Zehui Lin
Liwei Wu
Mingxuan Wang
Lei Li
78
84
0
19 May 2021
Rethinking Skip Connection with Layer Normalization in Transformers and ResNets
Fenglin Liu
Xuancheng Ren
Zhiyuan Zhang
Xu Sun
Yuexian Zou
AI4CE
81
69
0
15 May 2021
Global Structure-Aware Drum Transcription Based on Self-Attention Mechanisms
Ryoto Ishizuka
Ryo Nishikimi
Kazuyoshi Yoshii
53
6
0
12 May 2021
AFINet: Attentive Feature Integration Networks for Image Classification
Xinglin Pan
Jing Xu
Yu Pan
Liangjiang Wen
Wenxiang Lin
Kun Bai
Zenglin Xu
68
10
0
10 May 2021
SafeDrug: Dual Molecular Graph Encoders for Recommending Effective and Safe Drug Combinations
Chaoqi Yang
Cao Xiao
Fenglong Ma
Lucas Glass
Jimeng Sun
46
85
0
05 May 2021
Inpainting Transformer for Anomaly Detection
Jonathan Pirnay
K. Chai
ViT
207
169
0
28 Apr 2021
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
Danielle Saunders
AI4CE
130
91
0
14 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
105
87
0
13 Apr 2021
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation
Bei Li
Quan Du
Tao Zhou
Shuhan Zhou
Xin Zeng
Tong Xiao
Jingbo Zhu
63
23
0
06 Apr 2021
Grounding Dialogue Systems via Knowledge Graph Aware Decoding with Pre-trained Transformers
Debanjan Chaudhuri
Md. Rony
Jens Lehmann
66
12
0
30 Mar 2021
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin Shahbazi
Rishab Sharma
Fatemeh H. Fard
92
26
0
19 Mar 2021
3D Human Pose Estimation with Spatial and Temporal Transformers
Ce Zheng
Sijie Zhu
Matías Mendieta
Taojiannan Yang
Chong Chen
Zhengming Ding
ViT
173
456
0
18 Mar 2021
Translating the Unseen? Yoruba-English MT in Low-Resource, Morphologically-Unmarked Settings
Ife Adebara
Muhammad Abdul-Mageed
Miikka Silfverberg
61
6
0
07 Mar 2021
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Zejian Liu
Gang Li
Jian Cheng
MQ
55
61
0
04 Mar 2021
Centroid Transformers: Learning to Abstract with Attention
Lemeng Wu
Xingchao Liu
Qiang Liu
3DPC
103
29
0
17 Feb 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
110
71
0
08 Feb 2021
Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow
Kaibo Cao
Chunyang Chen
Sebastian Baltes
Christoph Treude
Xiang Chen
71
60
0
01 Feb 2021
To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph
Sufeng Duan
Hai Zhao
MILM
64
0
0
16 Jan 2021
Investigating the Vision Transformer Model for Image Retrieval Tasks
S. Gkelios
Y. Boutalis
S. Chatzichristofis
VLM, ViT
75
30
0
11 Jan 2021
An Efficient Transformer Decoder with Compressed Sub-layers
Yanyang Li
Ye Lin
Tong Xiao
Jingbo Zhu
88
30
0
03 Jan 2021
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
113
69
0
30 Dec 2020
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Xuebo Liu
Longyue Wang
Derek F. Wong
Liang Ding
Lidia S. Chao
Zhaopeng Tu
AI4CE
66
35
0
29 Dec 2020
Learning Light-Weight Translation Models from Deep Transformer
Bei Li
Ziyang Wang
Hui Liu
Quan Du
Tong Xiao
Chunliang Zhang
Jingbo Zhu
VLM
171
40
0
27 Dec 2020
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
233
2,278
0
23 Dec 2020
RealFormer: Transformer Likes Residual Attention
Ruining He
Anirudh Ravula
Bhargav Kanagal
Joshua Ainslie
76
110
0
21 Dec 2020
Multi-Interactive Attention Network for Fine-grained Feature Learning in CTR Prediction
Kai Zhang
Hao Qian
Daixin Wang
Qi Liu
Longfei Li
Jun Zhou
Jianhui Ma
Enhong Chen
HAI
64
50
0
13 Dec 2020
Improving Gradient Flow with Unrolled Highway Expectation Maximization
C. Song
Eunseok Kim
Inwook Shim
28
2
0
09 Dec 2020
Pre-Trained Image Processing Transformer
Hanting Chen
Yunhe Wang
Tianyu Guo
Chang Xu
Yiping Deng
Zhenhua Liu
Siwei Ma
Chunjing Xu
Chao Xu
Wen Gao
VLM, ViT
171
1,690
0
01 Dec 2020
Optimizing Transformer for Low-Resource Neural Machine Translation
Ali Araabi
Christof Monz
VLM
86
78
0
04 Nov 2020
Layer-Wise Multi-View Learning for Neural Machine Translation
Qiang Wang
Changliang Li
Yue Zhang
Tong Xiao
Jingbo Zhu
26
3
0
03 Nov 2020
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation
Hang Le
J. Pino
Changhan Wang
Jiatao Gu
D. Schwab
Laurent Besacier
115
83
0
02 Nov 2020
Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model
Mana Ihori
Ryo Masumura
Naoki Makishima
Tomohiro Tanaka
Akihiko Takashima
Shota Orihashi
KELM
30
1
0
29 Oct 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
89
263
0
26 Oct 2020
Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
Minjia Zhang
Yuxiong He
AI4CE
48
104
0
26 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
750
41,796
0
22 Oct 2020
BERT for Joint Multichannel Speech Dereverberation with Spatial-aware Tasks
Yang Jiao
29
0
0
21 Oct 2020
Multi-Unit Transformers for Neural Machine Translation
Jianhao Yan
Fandong Meng
Jie Zhou
57
17
0
21 Oct 2020
Dual Averaging is Surprisingly Effective for Deep Learning Optimization
Samy Jelassi
Aaron Defazio
56
5
0
20 Oct 2020
Training Flexible Depth Model by Multi-Task Learning for Neural Machine Translation
Qiang Wang
Tong Xiao
Jingbo Zhu
42
2
0
16 Oct 2020
Chatbot Interaction with Artificial Intelligence: Human Data Augmentation with T5 and Language Transformer Ensemble for Text Classification
Jordan J. Bird
Anikó Ekárt
Diego Resende Faria
61
60
0
12 Oct 2020
Query-Key Normalization for Transformers
Alex Henry
Prudhvi Raj Dachapally
S. Pawar
Yuxuan Chen
59
91
0
08 Oct 2020
Shallow-to-Deep Training for Neural Machine Translation
Bei Li
Ziyang Wang
Hui Liu
Yufan Jiang
Quan Du
Tong Xiao
Huizhen Wang
Jingbo Zhu
69
49
0
08 Oct 2020
Semantic Evaluation for Text-to-SQL with Distilled Test Suites
Ruiqi Zhong
Tao Yu
Dan Klein
67
135
0
06 Oct 2020
PRover: Proof Generation for Interpretable Reasoning over Rules
Swarnadeep Saha
Sayan Ghosh
Shashank Srivastava
Joey Tianyi Zhou
ReLM, LRM
87
78
0
06 Oct 2020
Efficient Inference For Neural Machine Translation
Y. Hsu
Sarthak Garg
Yi-Hsiu Liao
Ilya Chatsviorkin
AI4CE
60
12
0
06 Oct 2020
WeChat Neural Machine Translation Systems for WMT20
Fandong Meng
Jianhao Yan
Yijin Liu
Yuan Gao
Xia Zeng
...
Peng Li
Ming Chen
Jie Zhou
Sifan Liu
Hao Zhou
99
21
0
01 Oct 2020