Training Neural Networks from Scratch with Parallel Low-Rank Adapters
arXiv 2402.16828 · 26 February 2024
Minyoung Huh, Brian Cheung, Jeremy Bernstein, Phillip Isola, Pulkit Agrawal
Papers citing "Training Neural Networks from Scratch with Parallel Low-Rank Adapters" (17 of 17 papers shown)
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo (12 Mar 2025)
Zachary B. Charles, Gabriel Teston, Lucio Dery, Keith Rush, Nova Fallen, Zachary Garrett, Arthur Szlam, Arthur Douillard
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation (16 Feb 2025)
Zhengwu Liu, Ruijie Zhang, Zihan Wang, Zi Yang, Paul Hovland, Bogdan Nicolae, Franck Cappello, Z. Zhang
Neural Precision Polarization: Simplifying Neural Network Inference with Dual-Level Precision [MQ] (06 Nov 2024)
Dinithi Jayasuriya, Nastaran Darabi, Maeesha Binte Hashem, A. R. Trivedi
MoIN: Mixture of Introvert Experts to Upcycle an LLM [MoE] (13 Oct 2024)
Ajinkya Tejankar, K. Navaneet, Ujjawal Panchal, Kossar Pourahmadi, Hamed Pirsiavash
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning [MoMe] (13 Aug 2024)
Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Joey Tianyi Zhou, Leshem Choshen, Alessandro Sordoni
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences [MoE] (10 Jul 2024)
Nikolaos Dimitriadis, Pascal Frossard, F. Fleuret
A Survey on LoRA of Large Language Models [ALM] (08 Jul 2024)
Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (04 Jun 2024)
Andi Han, Jiaxiang Li, Wei Huang, Mingyi Hong, Akiko Takeda, Pratik Jawanpuria, Bamdev Mishra
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization [VLM] (23 May 2024)
Zi Yang, Samridhi Choudhary, Xinfeng Xie, Cao Gao, Siegfried Kunzmann, Zheng-Wei Zhang
DiLoCo: Distributed Low-Communication Training of Language Models (14 Nov 2023)
Arthur Douillard, Qixuang Feng, Andrei A. Rusu, Rachita Chhaparia, Yani Donchev, A. Kuncoro, Marc'Aurelio Ranzato, Arthur Szlam, Jiajun Shen
PopulAtion Parameter Averaging (PAPA) [MoMe] (06 Apr 2023)
Alexia Jolicoeur-Martineau, Emy Gervais, Kilian Fatras, Yan Zhang, Simon Lacoste-Julien
Git Re-Basin: Merging Models modulo Permutation Symmetries [MoMe] (11 Sep 2022)
Samuel K. Ainsworth, J. Hayase, S. Srinivasa
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling [VLM] (06 Nov 2021)
Renrui Zhang, Rongyao Fang, Wei Zhang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
A Field Guide to Federated Optimization [FedML] (14 Jul 2021)
Jianyu Wang, Zachary B. Charles, Zheng Xu, Gauri Joshi, H. B. McMahan, ..., Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu
MLP-Mixer: An all-MLP Architecture for Vision (04 May 2021)
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
Optimizing Mode Connectivity via Neuron Alignment [MoMe] (05 Sep 2020)
N. Joseph Tatro, Pin-Yu Chen, Payel Das, Igor Melnyk, P. Sattigeri, Rongjie Lai
ImageNet Large Scale Visual Recognition Challenge [VLM, ObjD] (01 Sep 2014)
Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei