arXiv:2310.12236, v2 (latest)

Direct Neural Machine Translation with Task-level Mixture of Experts models

18 October 2023
Isidora Chara Tourni
Subhajit Naskar
    MoE

Papers cited by "Direct Neural Machine Translation with Task-level Mixture of Experts models"

26 / 26 papers shown
Tutel: Adaptive Mixture-of-Experts at Scale
Changho Hwang
Wei Cui
Yifan Xiong
Ziyue Yang
Ze Liu
...
Joe Chau
Peng Cheng
Fan Yang
Mao Yang
Y. Xiong
MoE
182
121
0
07 Jun 2022
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa
C. Riquelme
J. Puigcerver
Rodolphe Jenatton
N. Houlsby
VLM, MoE
165
202
0
06 Jun 2022
Building Machine Translation Systems for the Next Thousand Languages
Ankur Bapna
Isaac Caswell
Julia Kreutzer
Orhan Firat
D. Esch
...
Apurva Shah
Yanping Huang
Zhiwen Chen
Yonghui Wu
Macduff Hughes
109
101
0
09 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM, LRM
535
6,301
0
05 Apr 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
192
203
0
17 Feb 2022
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Nan Du
Yanping Huang
Andrew M. Dai
Simon Tong
Dmitry Lepikhin
...
Kun Zhang
Quoc V. Le
Yonghui Wu
Zhiwen Chen
Claire Cui
ALM, MoE
233
829
0
13 Dec 2021
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
255
111
0
24 Sep 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
129
610
0
10 Jun 2021
Improving Zero-Shot Translation by Disentangling Positional Information
Danni Liu
Jan Niehues
James Cross
Francisco Guzmán
Xian Li
90
49
0
30 Dec 2020
Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation
Annette Rios Gonzales
Mathias Müller
Rico Sennrich
52
19
0
03 Nov 2020
Complete Multilingual Neural Machine Translation
Markus Freitag
Orhan Firat
78
44
0
20 Oct 2020
Scalable Transfer Learning with Expert Models
J. Puigcerver
C. Riquelme
Basil Mustafa
Cédric Renggli
André Susano Pinto
Sylvain Gelly
Daniel Keysers
N. Houlsby
131
64
0
28 Sep 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
132
1,191
0
30 Jun 2020
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
Su Lin Blodgett
Solon Barocas
Hal Daumé
Hanna M. Wallach
157
1,257
0
28 May 2020
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
103
1,506
0
09 Apr 2020
Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages
Yunsu Kim
P. Petrov
Pavel Petrushkov
Shahram Khadivi
Hermann Ney
LRM
78
84
0
20 Sep 2019
Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations
Jiatao Gu
Yong Wang
Kyunghyun Cho
Victor O.K. Li
58
107
0
04 Jun 2019
The Missing Ingredient in Zero-Shot Neural Machine Translation
N. Arivazhagan
Ankur Bapna
Orhan Firat
Roee Aharoni
Melvin Johnson
Wolfgang Macherey
83
117
0
17 Mar 2019
Massively Multilingual Neural Machine Translation
Roee Aharoni
Melvin Johnson
Orhan Firat
LRM, AI4CE
85
489
0
28 Feb 2019
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
206
3,531
0
19 Aug 2018
A neural interlingua for multilingual machine translation
Y. Lu
Phillip Keung
Faisal Ladhak
Vikas Bhardwaj
Shaonan Zhang
Jason Sun
AI4CE
93
125
0
23 Apr 2018
A Teacher-Student Framework for Zero-Resource Neural Machine Translation
Yun Chen
Yang Liu
Yong Cheng
Victor O.K. Li
117
148
0
02 May 2017
Neural Machine Translation with Pivot Languages
Yong Cheng
Yang Liu
Qian Yang
Maosong Sun
Wenyuan Xu
58
97
0
15 Nov 2016
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
Orhan Firat
Kyunghyun Cho
Yoshua Bengio
LRM, AIMat
277
627
0
06 Jan 2016
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
450
20,606
0
10 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
580
27,338
0
01 Sep 2014