Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
11 September 2023
Ted Zadouri
Ahmet Üstün
Arash Ahmadian
Beyza Ermiş
Acyr Locatelli
Sara Hooker
MoE
arXiv: 2309.05444
Papers citing "Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning" (50 of 80 papers shown)
A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
Junzhou Xu
Boyu Diao
MoE
52
0
0
06 May 2025
PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang
Xiaolu Liu
Lingdong Kong
Jianyun Xu
Chunyong Hu
Gongfan Fang
Wentong Li
Jianke Zhu
Xinchao Wang
35
0
0
22 Apr 2025
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang
Mozhgan Nasr Azadani
Sean Sedwards
Krzysztof Czarnecki
MLLM
MoE
57
0
0
07 Apr 2025
Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations
Zihuai Zhao
Wenqi Fan
Yao Wu
Qing Li
83
1
0
05 Apr 2025
MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-tuning
Maolin Wang
Xiangyu Zhao
AI4CE
48
0
0
01 Apr 2025
Mixture of Routers
Jia-Chen Zhang
Yu-Jie Xiong
Xi-He Qiu
Chun-Ming Xia
Fei Dai
MoE
76
0
0
30 Mar 2025
Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin
Rishab Balasubramanian
Fengyuan Liu
Nikhil Kandpal
Tu Vu
66
1
0
25 Mar 2025
Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
Dingkun Zhang
Shuhan Qi
Xinyu Xiao
Kehai Chen
Xuan Wang
CLL
MoMe
71
0
0
08 Mar 2025
Multi-Level Collaboration in Model Merging
Qi Li
Runpeng Yu
Xinchao Wang
MoMe
FedML
97
0
0
03 Mar 2025
Sample Selection via Contrastive Fragmentation for Noisy Label Regression
C. Kim
Sangwoo Moon
Jihwan Moon
Dongyeon Woo
Gunhee Kim
NoLa
59
0
0
25 Feb 2025
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan
Zhenyi Lu
Sichen Liu
Xiaoye Qu
Wei Wei
Chengfeng Gu
Yu-Xi Cheng
MoE
216
0
0
24 Feb 2025
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Jiahui Gao
...
Yu Zhang
Zhenguo Li
Xin Jiang
Qiang Liu
James T. Kwok
MoE
108
9
0
20 Feb 2025
A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Mengyang Sun
Yihao Wang
Tao Feng
Dan Zhang
Yifan Zhu
J. Tang
MoE
45
0
0
20 Feb 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
155
14
0
20 Feb 2025
Ensembles of Low-Rank Expert Adapters
Yinghao Li
Vianne Gao
Chao Zhang
MohamadAli Torkamani
77
0
0
31 Jan 2025
Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
Ziyu Zhao
Yixiao Zhou
Didi Zhu
Tao Shen
Junfeng Fang
Jing Su
Kun Kuang
Zhongyu Wei
Fei Wu
Yu Cheng
MoE
45
2
0
28 Jan 2025
GraphLoRA: Empowering LLMs Fine-Tuning via Graph Collaboration of MoE
Ting Bai
Yue Yu
Le Huang
Zenan Xu
Zhe Zhao
Chuan Shi
MoE
253
0
0
18 Dec 2024
Investigating Mixture of Experts in Dense Retrieval
Effrosyni Sokli
Pranav Kasela
Georgios Peikos
G. Pasi
MoE
77
1
0
16 Dec 2024
MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning
Lulu Zhao
Weihao Zeng
Xiaofeng Shi
Hua Zhou
MoMe
MoE
85
0
0
12 Dec 2024
MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning
Yufei Ma
Zihan Liang
Huangyu Dai
Bin Chen
D. Gao
...
Linbo Jin
Wen Jiang
Guannan Zhang
Xiaoyan Cai
Libin Yang
MoE
MoMe
99
1
0
10 Dec 2024
PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment
Dongxu Liu
Bing Xu
Yinzhuo Chen
Bufan Xu
Wenpeng Lu
Muyun Yang
Tiejun Zhao
MoE
44
1
0
02 Nov 2024
MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning
Xujia Wang
Haiyan Zhao
Shuo Wang
Hanqing Wang
Zhiyuan Liu
MoMe
MoE
40
0
0
30 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
52
3
0
24 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Joey Tianyi Zhou
Tianlong Chen
MoMe
MoE
33
2
0
09 Oct 2024
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
Sheng Wang
Liheng Chen
Pengan Chen
Jingwei Dong
Boyang Xue
Jiyue Jiang
Lingpeng Kong
Chuan Wu
MoE
31
8
0
01 Oct 2024
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
Xun Zhu
Ying Hu
Fanbin Mo
Miao Li
Ji Wu
54
8
0
26 Sep 2024
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan
Bettina Messmer
N. Doikov
Martin Jaggi
MoMe
MoE
50
2
0
20 Sep 2024
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr Locatelli
Sara Hooker
Ahmet Üstün
MoE
58
1
0
28 Aug 2024
Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
45
0
0
27 Aug 2024
Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings
Sagar Srinivas Sakhinana
Geethan Sannidhi
Chidaksh Ravuru
Venkataramana Runkana
AI4TS
35
0
0
24 Aug 2024
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
Hao Zhou
Zhijun Wang
Shujian Huang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Weihua Luo
Jiajun Chen
CLL
MoE
59
5
0
21 Aug 2024
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav
Colin Raffel
Mohammed Muqeeth
Lucas Caccia
Haokun Liu
Tianlong Chen
Joey Tianyi Zhou
Leshem Choshen
Alessandro Sordoni
MoMe
51
21
0
13 Aug 2024
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation
Jingjing Xie
Yuxin Zhang
Mingbao Lin
Liujuan Cao
Rongrong Ji
MQ
41
4
0
07 Aug 2024
MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts
Lin Ning
Harsh Lara
Meiqi Guo
Abhinav Rastogi
MoMe
MoE
37
1
0
02 Aug 2024
Low-Rank Interconnected Adaptation Across Layers
Yibo Zhong
Yao Zhou
OffRL
MoE
50
1
0
13 Jul 2024
Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran
Mengzhou Wu
Wei Yang
Tao Xie
AI4CE
39
1
0
11 Jul 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
63
14
0
08 Jul 2024
Mixture of A Million Experts
Xu Owen He
MoE
46
26
0
04 Jul 2024
Lateralization LoRA: Interleaved Instruction Tuning with Modality-Specialized Adaptations
Zhiyang Xu
Minqian Liu
Ying Shen
Joy Rimchala
Jiaxin Zhang
Qifan Wang
Yu Cheng
Lifu Huang
VLM
39
2
0
04 Jul 2024
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang
Piji Li
KELM
CLL
57
7
0
28 Jun 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
59
2
0
28 Jun 2024
Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning
Ziyu Zhao
Leilei Gan
Guoyin Wang
Yuwei Hu
Tao Shen
Hongxia Yang
Kun Kuang
Fei Wu
MoE
MoMe
39
12
0
24 Jun 2024
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models
Zihao Zeng
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
MoE
52
8
0
19 Jun 2024
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen
Xinyu Zhao
Tianlong Chen
Yu Cheng
MoE
83
5
0
17 Jun 2024
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang
Juntae Lee
Kyuhong Shim
Seunghan Yang
Simyung Chang
39
5
0
11 Jun 2024
MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Renzhi Wang
Piji Li
KELM
42
3
0
29 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
Edoardo Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Caccia
Alessandro Sordoni
MoMe
46
31
0
18 May 2024
Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
Joo Young Choi
Jaesung R. Park
Inkyu Park
Jaewoong Cho
Albert No
Ernest K. Ryu
AI4CE
35
4
0
07 May 2024
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Yixue Hao
Long Hu
Min Chen
37
14
0
02 May 2024
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
Zefang Liu
Jiahua Luo
MoE
KELM
43
11
0
01 May 2024