Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.09687
Cited By
MoIN: Mixture of Introvert Experts to Upcycle an LLM
13 October 2024
Ajinkya Tejankar
K. Navaneet
Ujjawal Panchal
Kossar Pourahmadi
Hamed Pirsiavash
MoE
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"MoIN: Mixture of Introvert Experts to Upcycle an LLM"
13 / 13 papers shown
Title
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
Nikolaos Dimitriadis
Pascal Frossard
François Fleuret
MoE
224
8
0
10 Jul 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao
Zhenyu Zhang
Beidi Chen
Zhangyang Wang
A. Anandkumar
Yuandong Tian
99
224
0
06 Mar 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
82
12
0
26 Feb 2024
Mixtral of Experts
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
...
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
155
1,117
0
08 Jan 2024
ReLoRA: High-Rank Training Through Low-Rank Updates
Vladislav Lialin
Namrata Shivagunde
Sherin Muckatira
Anna Rumshisky
BDL
79
117
0
11 Jul 2023
TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
Joel Jang
Seonghyeon Ye
Changho Lee
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Minjoon Seo
CLL
KELM
102
98
0
29 Apr 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari
Conglong Li
Z. Yao
Minjia Zhang
Reza Yazdani Aminabadi
A. A. Awan
Jeff Rasley
Yuxiong He
112
302
0
14 Jan 2022
Towards Continual Knowledge Learning of Language Models
Joel Jang
Seonghyeon Ye
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Stanley Jungkyu Choi
Minjoon Seo
CLL
KELM
297
161
0
07 Oct 2021
Knowledge Neurons in Pretrained Transformers
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELM
MU
97
463
0
18 Apr 2021
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
244
1,551
0
24 May 2019
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Leland McInnes
John Healy
James Melville
199
9,473
0
09 Feb 2018
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
374
7,561
0
02 Dec 2016
Deep Learning of Representations: Looking Forward
Yoshua Bengio
218
682
0
02 May 2013
1