arXiv: 2310.18859 · v2 (latest)
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
29 October 2023
Zhixu Du
Shiyu Li
Yuhao Wu
Xiangyu Jiang
Jingwei Sun
Qilin Zheng
Yongkai Wu
Ang Li
Hai Helen Li
Yiran Chen
MoE
ArXiv (abs) · PDF · HTML · GitHub (17★)
Papers citing "SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models" (3 / 3 papers shown)
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference
Yujie Zhang, Shivam Aggarwal, T. Mitra
MoE · 16 Dec 2024

Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference
Andrii Skliar, T. V. Rozendaal, Romain Lepert, Todor Boinovski, M. V. Baalen, Markus Nagel, Paul N. Whatmough, B. Bejnordi
MoE · 27 Nov 2024

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang
07 Apr 2024