ResearchTrend.AI

HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts

20 February 2024
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
    MoE

Papers citing "HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts"

17 / 17 papers shown
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation
Yunlong Liang
Fandong Meng
Jie Zhou
MoE
14
0
0
20 May 2025
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
221
1
0
10 Mar 2025
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts
Zhenpeng Su
Xing Wu
Zijia Lin
Yizhe Xiong
Minxuan Lv
Guangyuan Ma
Hui Chen
Songlin Hu
Guiguang Ding
MoE
29
3
0
21 Oct 2024
On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs
Herun Wan
Minnan Luo
Zhixiong Su
Guang Dai
Xiang Zhao
DeLMO
35
0
0
16 Oct 2024
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL
Ruilin Luo
Liyuan Wang
Binghuai Lin
Zicheng Lin
Yujiu Yang
LRM
35
6
0
21 Sep 2024
Mixture of Diverse Size Experts
Manxi Sun
Wei Liu
Jian Luan
Pengzhi Gao
Bin Wang
MoE
28
1
0
18 Sep 2024
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
81
2
0
13 Aug 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
Zhenpeng Su
Zijia Lin
Xue Bai
Xing Wu
Yizhe Xiong
...
Guangyuan Ma
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
MoE
34
5
0
13 Jul 2024
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
Haoze Wu
Zihan Qiu
Zili Wang
Hang Zhao
Jie Fu
MoE
51
3
0
18 Jun 2024
Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction
Yonggang Jin
Ge Zhang
Hao Zhao
Tianyu Zheng
Jiawei Guo
Liuyu Xiang
Shawn Yue
Stephen W. Huang
Zhaofeng He
Jie Fu
OffRL
34
4
0
06 Feb 2024
Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning
Hao Zhao
Jie Fu
Zhaofeng He
105
6
0
18 Oct 2023
Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
Ahmet Üstün
Arianna Bisazza
G. Bouma
Gertjan van Noord
Sebastian Ruder
54
32
0
24 May 2022
Multilingual Machine Translation with Hyper-Adapters
Christos Baziotis
Mikel Artetxe
James Cross
Shruti Bhosale
75
21
0
22 May 2022
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
331
0
18 Feb 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,996
0
20 Apr 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,599
0
17 Apr 2017
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomáš Kočiský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
211
3,515
0
10 Jun 2015