Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
11 September 2023
Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker
Tags: MoE

Papers citing "Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning"

30 of 80 citing papers shown
Towards Incremental Learning in Large Language Models: A Critical Review
M. Jovanovic, Peter Voss
Tags: ELM, CLL, KELM
28 Apr 2024

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts
Yusheng Liao, Shuyang Jiang, Yu Wang, Yanfeng Wang
Tags: MoE
13 Apr 2024

Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning
Yijiang Liu, Rongyu Zhang, Huanrui Yang, Kurt Keutzer, Yuan Du, Li Du, Shanghang Zhang
Tags: MoE
13 Apr 2024

SilverSight: A Multi-Task Chinese Financial Large Language Model Based on Adaptive Semantic Space Learning
Yuhang Zhou, Zeping Li, Siyu Tian, Yuchen Ni, Sen Liu, Guangnan Ye, Hongfeng Chai
07 Apr 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang
21 Mar 2024

Conditional computation in neural networks: principles and research trends
Simone Scardapane, Alessandro Baiocchi, Alessio Devoto, V. Marsocci, Pasquale Minervini, Jary Pomponi
12 Mar 2024

Online Adaptation of Language Models with a Memory of Amortized Contexts
Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, Jonathan Richard Schwarz
Tags: KELM
07 Mar 2024

Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao, Zhuojun Li, Shilong Wang, Yang Wang, Yulin Hu, Yanyan Zhao, Chen Wei, Bing Qin
15 Feb 2024

LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild
Ziyu Zhao, Leilei Gan, Guoyin Wang, Wangchunshu Zhou, Hongxia Yang, Kun Kuang, Fei Wu
Tags: MoMe
15 Feb 2024

Higher Layers Need More LoRA Experts
Chongyang Gao, Kezhen Chen, Jinmeng Rao, Baochen Sun, Ruibo Liu, Daiyi Peng, Yawen Zhang, Xiaoyuan Guo, Jie Yang, V. Subrahmanian
Tags: MoE
13 Feb 2024

Learning to Route Among Specialized Experts for Zero-Shot Generalization
Mohammed Muqeeth, Haokun Liu, Yufan Liu, Colin Raffel
Tags: MoMe
08 Feb 2024

Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters
Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti
Tags: MoE
01 Feb 2024

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs
Shaoxiang Chen, Zequn Jie, Lin Ma
Tags: MoE
29 Jan 2024

OrchMoE: Efficient Multi-Adapter Learning with Task-Skill Synergy
Haowen Wang, Tao Sun, Kaixiang Ji, Jian Wang, Cong Fan, Jinjie Gu
19 Jan 2024

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning
Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang
Tags: MoE, MLLM, VLM
19 Dec 2023

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
18 Dec 2023

Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference
Bartosz Wójcik, Alessio Devoto, Karol Pustelnik, Pasquale Minervini, Simone Scardapane
15 Dec 2023

MoSA: Mixture of Sparse Adapters for Visual Efficient Tuning
Qizhe Zhang, Bocheng Zou, Ruichuan An, Jiaming Liu, Shanghang Zhang
Tags: MoE
05 Dec 2023

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
Jialin Wu, Xia Hu, Yaqing Wang, Bo Pang, Radu Soricut
Tags: MoE
01 Dec 2023

SiRA: Sparse Mixture of Low Rank Adaptation
Yun Zhu, Nevan Wichers, Chu-Cheng Lin, Xinyi Wang, Tianlong Chen, ..., Han Lu, Canoee Liu, Liangchen Luo, Jindong Chen, Lei Meng
Tags: MoE
15 Nov 2023

When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications
Qidong Liu, Xian Wu, Xiangyu Zhao, Yuanshao Zhu, Derong Xu, Feng Tian, Yefeng Zheng
Tags: MoE
21 Oct 2023

Label Supervised LLaMA Finetuning
Zongxi Li, Xianming Li, Yuzhang Liu, Haoran Xie, Jing Li, F. Wang, Qing Li, Xiaoqin Zhong
Tags: ALM
02 Oct 2023

ConPET: Continual Parameter-Efficient Tuning for Large Language Models
Chenyan Song, Xu Han, Zheni Zeng, Kuai Li, Chen Chen, Zhiyuan Liu, Maosong Sun, Taojiannan Yang
Tags: CLL, KELM
26 Sep 2023

Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth, Haokun Liu, Colin Raffel
Tags: MoMe, MoE
06 Jun 2023

Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou, Tao Lei, Han-Chu Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew M. Dai, Zhifeng Chen, Quoc V. Le, James Laudon
Tags: MoE
18 Feb 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
Tags: LM&Ro, LRM, AI4CE, ReLM
28 Jan 2022

Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, ..., T. Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush
Tags: LRM
15 Oct 2021

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta, Yanping Huang, Ankur Bapna, M. Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
Tags: MoE
24 Sep 2021

The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant
Tags: VPVLM
18 Apr 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020