DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models

10 September 2024

Papers citing "DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models"

3 / 3 papers shown

Title
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness Zihao Zheng Xiuping Cui Size Zheng Maoliang Li Jiayu Chen Yun Liang Xiang Chen MQ MoE 64 0 0 27 Mar 2025
Mixture-of-Experts with Expert Choice Routing Yan-Quan Zhou Tao Lei Han-Chu Liu Nan Du Yanping Huang Vincent Zhao Andrew M. Dai Zhifeng Chen Quoc V. Le James Laudon MoE 160 329 0 18 Feb 2022
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 264 4,489 0 23 Jan 2020