Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.10559
Cited By
OrchMoE: Efficient Multi-Adapter Learning with Task-Skill Synergy
19 January 2024
Haowen Wang
Tao Sun
Kaixiang Ji
Jian Wang
Cong Fan
Jinjie Gu
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"OrchMoE: Efficient Multi-Adapter Learning with Task-Skill Synergy"
12 / 12 papers shown
Title
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
193
11
0
23 Jan 2025
Multi-Head Adapter Routing for Cross-Task Generalization
Lucas Caccia
Edoardo Ponti
Zhan Su
Matheus Pereira
Nicolas Le Roux
Alessandro Sordoni
51
23
0
07 Nov 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
123
846
0
16 Apr 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
298
358
0
18 Feb 2022
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
477
10,367
0
17 Jun 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
579
4,047
0
18 Apr 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
137
1,545
0
18 Mar 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
246
4,261
0
01 Jan 2021
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
442
20,181
0
23 Oct 2019
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
339
5,364
0
03 Nov 2016
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
196
2,533
0
02 Nov 2016
SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov
Frank Hutter
ODL
333
8,130
0
13 Aug 2016
1