AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference

19 August 2024
Shuzhang Zhong
Ling Liang
Yuan Wang
Runsheng Wang
Ru Huang
Meng Li
    MoE

Papers citing "AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference"

3 / 3 papers shown
D$^{2}$MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
Haodong Wang
Qihua Zhou
Zicong Hong
Song Guo
MoE
17 Apr 2025
Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling
Yan Li
Pengfei Zheng
Shuang Chen
Zewei Xu
Yuanhao Lai
Yunfei Du
Z. Wang
MoE
06 Mar 2025
DAOP: Data-Aware Offloading and Predictive Pre-Calculation for Efficient MoE Inference
Yujie Zhang
Shivam Aggarwal
T. Mitra
MoE
16 Dec 2024