ResearchTrend.AI
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality


14 October 2024
Peijun Qing
Chongyang Gao
Yefan Zhou
Xingjian Diao
Yaoqing Yang
Soroush Vosoughi
    MoMe
    MoE

Papers citing "AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality"

19 / 19 papers shown
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Vignesh Kothapalli
Tianyu Pang
Shenyang Deng
Zongmin Liu
Yaoqing Yang
69
4
0
07 Jun 2024
Higher Layers Need More LoRA Experts
Chongyang Gao
Kezhen Chen
Jinmeng Rao
Baochen Sun
Ruibo Liu
Daiyi Peng
Yawen Zhang
Xiaoyuan Guo
Jie Yang
V. Subrahmanian
MoE
44
50
0
13 Feb 2024
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
Yiren Jian
Tingkai Liu
Yunzhe Tao
Chunhui Zhang
Soroush Vosoughi
HX Yang
VLM
39
10
0
05 Oct 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang
Qian Liu
Bill Yuchen Lin
Tianyu Pang
Chao Du
Min Lin
MoMe
99
211
0
25 Jul 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
63
9
0
15 Jul 2023
Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang
A. Engel
Anand D. Sarwate
Ioana Dumitriu
Tony Chiang
71
18
0
11 Nov 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
278
1,245
0
20 Sep 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
486
6,240
0
05 Apr 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
189
195
0
17 Feb 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
285
4,408
0
27 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
342
1,702
0
15 Oct 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
201
3,750
0
03 Sep 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
795
42,055
0
28 May 2020
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data
Charles H. Martin
Tongsu Peng
Michael W. Mahoney
80
108
0
17 Feb 2020
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks
Charles H. Martin
Michael W. Mahoney
42
56
0
24 Jan 2019
Traditional and Heavy-Tailed Self Regularization in Neural Network Models
Charles H. Martin
Michael W. Mahoney
69
124
0
24 Jan 2019
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
140
1,733
0
02 Nov 2018
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin
Michael W. Mahoney
AI4CE
101
201
0
02 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,159
0
20 Apr 2018