LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
arXiv:2312.09979 · 15 December 2023
Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Topics: CLL · MoE · KELM
Papers citing "LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin" (16 of 16 papers shown)
CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning
Jinyuan Feng, Chaopeng Wei, Tenghai Qiu, Tianyi Hu, Zhiqiang Pu
MoE · 23 May 2025 · 0 citations

AdaptGCD: Multi-Expert Adapter Tuning for Generalized Category Discovery
Yuxun Qu, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang
29 Oct 2024 · 0 citations

Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu, D. Wu, Rose Yu, Yi-An Ma
09 Oct 2024 · 2 citations

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, Min Lin
MoMe · 25 Jul 2023 · 210 citations

Scaling Instruction-Finetuned Language Models
Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, ..., Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason W. Wei
ReLM · LRM · 20 Oct 2022 · 3,117 citations

A Review on Language Models as Knowledge Bases
Badr AlKhamissi, Millicent Li, Asli Celikyilmaz, Mona T. Diab, Marjan Ghazvininejad
KELM · 12 Apr 2022 · 185 citations

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM · ALM · 04 Mar 2022 · 12,916 citations

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, ..., Kun Zhang, Quoc V. Le, Yonghui Wu, Zhiwen Chen, Claire Cui
ALM · MoE · 13 Dec 2021 · 812 citations

Towards a Unified View of Parameter-Efficient Transfer Learning
Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
AAML · 08 Oct 2021 · 933 citations

Scaling Vision with Sparse Mixture of Experts
C. Riquelme, J. Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, N. Houlsby
MoE · 10 Jun 2021 · 600 citations

The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant
VPVLM · 18 Apr 2021 · 4,036 citations

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, M. Krikun, Noam M. Shazeer, Zhiwen Chen
MoE · 30 Jun 2020 · 1,162 citations

How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Adam Roberts, Colin Raffel, Noam M. Shazeer
KELM · 10 Feb 2020 · 890 citations

Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
KELM · AI4MH · 03 Sep 2019 · 2,664 citations

Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
Shashi Narayan, Shay B. Cohen, Mirella Lapata
AILaw · 27 Aug 2018 · 1,674 citations

RACE: Large-scale ReAding Comprehension Dataset From Examinations
Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard H. Hovy
ELM · 15 Apr 2017 · 1,347 citations