Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
arXiv:2112.05820 · 10 December 2021
K. Kumatani, R. Gmyr, Andres Felipe Cruz Salinas, Linquan Liu, Wei Zuo, Devang Patel, Eric Sun, Yu Shi
MoE
Papers citing "Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition" (9 of 9 papers shown)
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen
MoE
02 Apr 2025
Tight Clusters Make Specialized Experts
Stefan K. Nielsen, R. Teo, Laziz U. Abdullaev, Tan M. Nguyen
MoE
21 Feb 2025
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts
R. Teo, Tan M. Nguyen
MoE
18 Oct 2024
Multi-Head Mixture-of-Experts
Xun Wu, Shaohan Huang, Wenhui Wang, Furu Wei
MoE
23 Apr 2024
Modular Deep Learning
Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, E. Ponti
MoMe · OOD
22 Feb 2023
MoEC: Mixture of Expert Clusters
Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei
MoE
19 Jul 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph, Irwan Bello, Sameer Kumar, Nan Du, Yanping Huang, J. Dean, Noam M. Shazeer, W. Fedus
MoE
17 Feb 2022
Multilingual Speech Recognition using Knowledge Transfer across Learning Processes
Rimita Lahiri, K. Kumatani, Eric Sun, Yao Qian
15 Oct 2021
A Configurable Multilingual Model is All You Need to Recognize All Languages
Long Zhou, Jinyu Li, Eric Sun, Shujie Liu
13 Jul 2021