OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
29 January 2024
Fuzhao Xue
Zian Zheng
Yao Fu
Jinjie Ni
Zangwei Zheng
Wangchunshu Zhou
Yang You
MoE
Papers citing "OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models" (50 of 72 papers shown)
UMoE: Unifying Attention and FFN with Shared Experts
Yuanhang Yang
Chaozheng Wang
Jing Li
MoE
29
0
0
12 May 2025
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ
MoE
163
0
0
09 May 2025
Backdoor Attacks Against Patch-based Mixture of Experts
Cedric Chan
Jona te Lintelo
S. Picek
AAML
MoE
151
0
0
03 May 2025
BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts
Qingyue Wang
Qi Pang
Xixun Lin
Shuai Wang
Daoyuan Wu
MoE
59
0
0
24 Apr 2025
Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models
Yuanbo Tang
Yan Tang
N. Zhang
Meixuan Chen
Yang Li
MoE
41
0
0
16 Apr 2025
Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming
Zhiqiang He
Zhi Liu
44
0
0
14 Apr 2025
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models
Hongcheng Guo
Juntao Yao
Boyang Wang
Junjia Du
Shaosheng Cao
Donglin Di
Shun Zhang
Zehan Li
MoE
40
0
0
10 Apr 2025
On the Spatial Structure of Mixture-of-Experts in Transformers
Daniel Bershatsky
Ivan V. Oseledets
MoE
40
0
0
06 Apr 2025
Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models
Zehua Liu
Han Wu
Ruifeng She
Xiaojin Fu
Xiongwei Han
Tao Zhong
Mingxuan Yuan
MoE
47
0
0
29 Mar 2025
Towards the Interpretability of Early Depression Risk Detection Using Large Language Models
Horacio Thompson
Maximiliano Sapino
Edgardo Ferretti
Marcelo Errecalde
53
0
0
26 Mar 2025
Mixture of Lookup Experts
Shibo Jie
Yehui Tang
Kai Han
Yongqian Li
Duyu Tang
Zhi-Hong Deng
Yunhe Wang
MoE
49
0
0
20 Mar 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
48
0
0
13 Mar 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Yue Zhou
Tianming Liu
Jin Lu
MoMe
90
0
0
13 Mar 2025
MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching
Tairan Xu
Leyang Xue
Zhan Lu
Adrian Jackson
Luo Mai
MoE
90
1
0
12 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
135
1
0
10 Mar 2025
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
158
0
0
10 Mar 2025
MoFE: Mixture of Frozen Experts Architecture
Jean Seo
Jaeyoon Kim
Hyopil Shin
MoE
167
0
0
09 Mar 2025
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts
Shwai He
Weilin Cai
Jiayi Huang
Ang Li
MoE
39
1
0
07 Mar 2025
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Y. Huang
Peng Ye
Chenyu Huang
Jianjian Cao
Lin Zhang
Baopu Li
Gang Yu
Tao Chen
MoMe
MoE
55
1
0
03 Mar 2025
Efficiently Editing Mixture-of-Experts Models with Compressed Experts
Y. He
Yang Liu
Chen Liang
Hany Awadalla
MoE
63
1
0
01 Mar 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
153
14
0
20 Feb 2025
Semantic Specialization in MoE Appears with Scale: A Study of DeepSeek R1 Expert Specialization
M. L. Olson
Neale Ratzlaff
Musashi Hinck
Man Luo
Sungduk Yu
Chendi Xue
Vasudev Lal
MoE
LRM
51
1
0
15 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Zhilin Wang
Muneeza Azmat
Ang Li
R. Horesh
Mikhail Yurochkin
118
1
0
11 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Z. Yang
Mike Zheng Shou
MoE
76
0
0
10 Feb 2025
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
Zhiyuan Fang
Yuegui Huang
Zicong Hong
Yufeng Lyu
Wuhui Chen
Yue Yu
Fan Yu
Zibin Zheng
MoE
48
0
0
09 Feb 2025
Ensembles of Low-Rank Expert Adapters
Yinghao Li
Vianne Gao
Chao Zhang
MohamadAli Torkamani
67
0
0
31 Jan 2025
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Zihan Qiu
Zeyu Huang
Jian Xu
Kaiyue Wen
Zekun Wang
Rui Men
Ivan Titov
Dayiheng Liu
Jingren Zhou
Junyang Lin
MoE
57
6
0
21 Jan 2025
Part-Of-Speech Sensitivity of Routers in Mixture of Experts Models
Elie Antoine
Frédéric Béchet
Philippe Langlais
MoE
78
0
0
22 Dec 2024
An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism
Qing Zhang
Haocheng Lv
Jie Liu
Z. Chen
Jianyong Duan
Hao Wang
Li He
Mingying Xv
69
1
0
08 Dec 2024
Mixture of Hidden-Dimensions Transformer
Yilong Chen
Junyuan Shang
Zhengyu Zhang
Jiawei Sheng
Tingwen Liu
Shuohuan Wang
Yu Sun
Hua-Hong Wu
Haifeng Wang
MoE
75
0
0
07 Dec 2024
HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting
Shaohan Yu
Pan Deng
Yu Zhao
J. Liu
Zi'ang Wang
MoE
182
0
0
30 Nov 2024
Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference
Andrii Skliar
T. V. Rozendaal
Romain Lepert
Todor Boinovski
M. V. Baalen
Markus Nagel
Paul N. Whatmough
B. Bejnordi
MoE
78
1
0
27 Nov 2024
Communication-Efficient Sparsely-Activated Model Training via Sequence Migration and Token Condensation
Fahao Chen
Peng Li
Zicong Hong
Zhou Su
Song Guo
MoMe
MoE
67
0
0
23 Nov 2024
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen
Chengyu Wang
Dakan Wang
Taolin Zhang
Wangyue Li
Xiaofeng He
KELM
80
1
0
23 Nov 2024
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
Nam V. Nguyen
Thong T. Doan
Luong Tran
Van Nguyen
Quang Pham
MoE
69
1
0
01 Nov 2024
ProMoE: Fast MoE-based LLM Serving using Proactive Caching
Xiaoniu Song
Zihang Zhong
Rong Chen
Haibo Chen
MoE
65
4
0
29 Oct 2024
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
Ruisi Cai
Yeonju Ro
Geon-Woo Kim
Peihao Wang
Babak Ehteshami Bejnordi
Aditya Akella
Zhilin Wang
MoE
25
3
0
24 Oct 2024
MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases
Zhisheng Lin
Yifu Liu
Zhiling Luo
Jinyang Gao
Yu Li
29
0
0
24 Oct 2024
ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts
Xumeng Han
Longhui Wei
Zhiyang Dou
Zipeng Wang
Chenhui Qiang
Xin He
Yingfei Sun
Zhenjun Han
Qi Tian
MoE
45
3
0
21 Oct 2024
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs
Xin Zhou
Ping Nie
Yiwen Guo
Haojie Wei
Zhanqiu Zhang
Pasquale Minervini
Ruotian Ma
Tao Gui
Qi Zhang
Xuanjing Huang
MoE
42
0
0
20 Oct 2024
γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Yaxin Luo
Gen Luo
Jiayi Ji
Yiyi Zhou
Xiaoshuai Sun
Zhiqiang Shen
Rongrong Ji
VLM
MoE
37
1
0
17 Oct 2024
Retrieval Instead of Fine-tuning: A Retrieval-based Parameter Ensemble for Zero-shot Learning
Pengfei Jin
Peng Shu
Sekeun Kim
Qing Xiao
S. Song
Cheng Chen
Tianming Liu
Xiang Li
Quanzheng Li
40
1
0
13 Oct 2024
Upcycling Large Language Models into Mixture of Experts
Ethan He
Abhinav Khattar
R. Prenger
V. Korthikanti
Zijie Yan
Tong Liu
Shiqing Fan
Ashwath Aithal
M. Shoeybi
Bryan Catanzaro
MoE
37
9
0
10 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
32
4
0
09 Oct 2024
Neutral residues: revisiting adapters for model extension
Franck Signe Talla
Hervé Jégou
Edouard Grave
25
0
0
03 Oct 2024
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Yu Sun
Hua-Hong Wu
Sen Su
MoE
26
0
0
02 Oct 2024
Spatial-Temporal Mixture-of-Graph-Experts for Multi-Type Crime Prediction
Ziyang Wu
Fan Liu
Jindong Han
Yuxuan Liang
Hao Liu
33
2
0
24 Sep 2024
Mixture of Diverse Size Experts
Manxi Sun
Wei Liu
Jian Luan
Pengzhi Gao
Bin Wang
MoE
28
1
0
18 Sep 2024
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
81
2
0
13 Aug 2024
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Xi Lin
Akshat Shrivastava
Liang Luo
Srinivasan Iyer
Mike Lewis
Gargi Ghosh
Luke Zettlemoyer
Armen Aghajanyan
MoE
40
20
0
31 Jul 2024