arXiv: 2402.12399

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE
17 February 2024
Zhiyuan Zeng
Qipeng Guo
Zhaoye Fei
Zhangyue Yin
Yunhua Zhou
Linyang Li
Tianxiang Sun
Hang Yan
Dahua Lin
Xipeng Qiu
MoE
MoMe
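For context on the routing mechanism named in the title: in a standard top-$k$ MoE layer, a router scores every expert per token, keeps only the $k$ highest-scoring experts, and renormalizes their gate weights. The sketch below is a minimal, generic illustration of that vanilla top-$k$ gating in NumPy; it is not the rectification method proposed in the paper, and the function name `top_k_route` is ours.

```python
import numpy as np

def top_k_route(logits: np.ndarray, k: int = 2):
    """Vanilla top-k routing: each token picks its k highest-scoring experts.

    logits: (num_tokens, num_experts) router scores.
    Returns (indices, weights): chosen expert ids and renormalized gates.
    """
    # Indices of the k largest logits per token (order within the k is arbitrary).
    idx = np.argpartition(logits, -k, axis=-1)[:, -k:]
    picked = np.take_along_axis(logits, idx, axis=-1)
    # Softmax over only the selected experts, as in standard top-k gating.
    picked = picked - picked.max(axis=-1, keepdims=True)
    w = np.exp(picked)
    w /= w.sum(axis=-1, keepdims=True)
    return idx, w
```

Tokens whose chosen experts are over capacity get dropped in this vanilla scheme; that dropped computation is the "waste" the paper's rectification targets.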
Papers citing "Turn Waste into Worth: Rectifying Top-$k$ Router of MoE" (6 papers)
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma, Zehan Li, L. Zhang, Gui-Song Xia, Bo Du, Liangpei Zhang, Dacheng Tao
31 Mar 2025

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts
Youngseog Chung, Dhruv Malik, J. Schneider, Yuanzhi Li, Aarti Singh
02 Sep 2024 · MoE

WDMoE: Wireless Distributed Large Language Models with Mixture of Experts
Nan Xue, Yaping Sun, Zhiyong Chen, Meixia Tao, Xiaodong Xu, Liang Qian, Shuguang Cui, Ping Zhang
06 May 2024 · MoE

MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts
Dengchun Li, Yingzi Ma, Naizheng Wang, Zhengmao Ye, Zhiyuan Cheng, ..., Yan Zhang, Lei Duan, Jie Zuo, Cal Yang, Mingjie Tang
22 Apr 2024 · MoE

From Sparse to Soft Mixtures of Experts
J. Puigcerver, C. Riquelme, Basil Mustafa, N. Houlsby
02 Aug 2023 · MoE

Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou, Tao Lei, Han-Chu Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew M. Dai, Zhifeng Chen, Quoc V. Le, James Laudon
18 Feb 2022 · MoE