Task-Specific Expert Pruning for Sparse Mixture-of-Experts
arXiv 2206.00277 · 1 June 2022
Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei
MoE
Papers citing "Task-Specific Expert Pruning for Sparse Mixture-of-Experts" (11 of 11 papers shown)

QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration
HamidReza Imani, Jiaxin Peng, Peiman Mohseni, Abdolah Amirany, Tarek A. El-Ghazawi
MoE · 10 May 2025 · Citations: 0

Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang, Luohe Shi, Qiwei Li, Zuchao Li, Ping Wang, Bo Du, Mengjia Shen, Hai Zhao
MoE · 06 May 2025 · Citations: 0

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen
MoE · 02 Apr 2025 · Citations: 0

Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
Raman Dutt, Harleen Hanspal, Guoxuan Xia, Petru-Daniel Tudosiu, Alexander Black, Yongxin Yang, Jingyu Sun, Sarah Parisot
MoE · 28 Mar 2025 · Citations: 0

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li
MoE · 22 Feb 2024 · Citations: 26

Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng, Dongsheng Peng, Hongxia Xu, Yun Li, Hongtu Zhu, Tianfan Fu, Huaxiu Yao
09 Feb 2024 · Citations: 5

SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Lingyu Kong, Yunxin Liu
MoE · 29 Aug 2023 · Citations: 8

Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model
Yeskendir Koishekenov, Alexandre Berard, Vassilina Nikoulina
MoE · 19 Dec 2022 · Citations: 29

MoEC: Mixture of Expert Clusters
Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei
MoE · 19 Jul 2022 · Citations: 11

Tricks for Training Sparse Translation Models
Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan
MoE · 15 Oct 2021 · Citations: 19

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 20 Apr 2018 · Citations: 7,005