2101.06840
ZeRO-Offload: Democratizing Billion-Scale Model Training
18 January 2021
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
Papers citing
"ZeRO-Offload: Democratizing Billion-Scale Model Training"
50 / 254 papers shown
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
Jiangfei Duan
Ziang Song
Xupeng Miao
Xiaoli Xi
Dahua Lin
Harry Xu
Minjia Zhang
Zhihao Jia
44
10
0
21 Mar 2024
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng
Richong Zhang
Junhao Zhang
Yanhan Ye
Zheyan Luo
Zhangchi Feng
Yongqiang Ma
35
368
0
20 Mar 2024
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Sun Ao
Weilin Zhao
Xu Han
Cheng Yang
Zhiyuan Liu
Chuan Shi
Maosong Sun
GNN
32
8
0
14 Mar 2024
Cyclic Data Parallelism for Efficient Parallelism of Deep Neural Networks
Louis Fournier
Edouard Oyallon
33
0
0
13 Mar 2024
Characterization of Large Language Model Development in the Datacenter
Qi Hu
Zhisheng Ye
Zerui Wang
Guoteng Wang
Mengdie Zhang
...
Dahua Lin
Xiaolin Wang
Yingwei Luo
Yonggang Wen
Tianwei Zhang
48
43
0
12 Mar 2024
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
Hongsun Jang
Jaeyong Song
Jaewon Jung
Jaeyoung Park
Youngsok Kim
Jinho Lee
29
11
0
11 Mar 2024
From English to ASIC: Hardware Implementation with Large Language Model
Emil Goh
Maoyang Xiang
I-Chyn Wey
T. Teo
28
6
0
11 Mar 2024
LLM-Oriented Retrieval Tuner
Si Sun
Hanqing Zhang
Zhiyuan Liu
Jie Bao
Dawei Song
RALM
38
0
0
04 Mar 2024
On the Compressibility of Quantized Large Language Models
Yu Mao
Weilan Wang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
28
6
0
03 Mar 2024
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao
Bin Jia
Hao Zhou
Ziming Liu
Shenggan Cheng
Yang You
24
4
0
02 Mar 2024
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Suyuchen Wang
I. Kobyzev
Peng Lu
Mehdi Rezagholizadeh
Bang Liu
35
11
0
29 Feb 2024
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Muyang Li
Tianle Cai
Jiaxin Cao
Qinsheng Zhang
Han Cai
Junjie Bai
Yangqing Jia
Ming-Yu Liu
Kai Li
Song Han
DiffM
29
41
0
29 Feb 2024
Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh
Hong Eun Ahn
L. Park
T.-H. Kwon
MIALM
AAML
31
0
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
42
2
0
19 Feb 2024
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Yihua Zhang
Pingzhi Li
Junyuan Hong
Jiaxiang Li
Yimeng Zhang
...
Wotao Yin
Mingyi Hong
Zhangyang Wang
Sijia Liu
Tianlong Chen
20
45
0
18 Feb 2024
MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache
Leyang Xue
Yao Fu
Zhan Lu
Luo Mai
Mahesh Marina
MoE
16
6
0
25 Jan 2024
LR-CNN: Lightweight Row-centric Convolutional Neural Network Training for Memory Reduction
Zhigang Wang
Hangyu Yang
Ning Wang
Chuanfei Xu
Jie Nie
Zhiqiang Wei
Yu Gu
Ge Yu
19
0
0
21 Jan 2024
PartIR: Composing SPMD Partitioning Strategies for Machine Learning
Sami Alabed
Daniel Belov
Bart Chrzaszcz
Juliana Franco
Dominik Grewe
...
Michael Schaarschmidt
Timur Sitdikov
Agnieszka Swietlik
Dimitrios Vytiniotis
Joel Wee
28
3
0
20 Jan 2024
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference
Xuanlei Zhao
Shenggan Cheng
Guangyang Lu
Jiarui Fang
Hao Zhou
Bin Jia
Ziming Liu
Yang You
MQ
17
3
0
19 Jan 2024
GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching
Cong Guo
Rui Zhang
Jiale Xu
Jingwen Leng
Zihan Liu
...
Minyi Guo
Hao Wu
Shouren Zhao
Junping Zhao
Ke Zhang
VLM
78
10
0
16 Jan 2024
Extending LLMs' Context Window with 100 Samples
Yikai Zhang
Junlong Li
Pengfei Liu
29
11
0
13 Jan 2024
Training and Serving System of Foundation Models: A Comprehensive Survey
Jiahang Zhou
Yanyu Chen
Zicong Hong
Wuhui Chen
Yue Yu
Tao Zhang
Hui Wang
Chuan-fu Zhang
Zibin Zheng
ALM
32
5
0
05 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xintao Hu
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
29
65
0
04 Jan 2024
Fast Inference of Mixture-of-Experts Language Models with Offloading
Artyom Eliseev
Denis Mazur
MoE
19
42
0
28 Dec 2023
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
27
214
0
25 Dec 2023
Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
Alan Chan
Ben Bucknall
Herbie Bradley
David M. Krueger
14
6
0
22 Dec 2023
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
Alexander Borzunov
Max Ryabinin
Artem Chumachenko
Dmitry Baranchuk
Tim Dettmers
Younes Belkada
Pavel Samygin
Colin Raffel
MoE
ALM
13
39
0
13 Dec 2023
Stateful Large Language Model Serving with Pensieve
Lingfan Yu
Jinyang Li
RALM
KELM
LLMAG
39
12
0
09 Dec 2023
KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
Haojie Pan
Zepeng Zhai
Hao Yuan
Yaojia Lv
Ruiji Fu
Ming Liu
Zhongyuan Wang
Bing Qin
LLMAG
RALM
18
10
0
08 Dec 2023
Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment
Fei Yang
Shuang Peng
Ning Sun
Fangyu Wang
Ke Tan
Fu Wu
Jiezhong Qiu
Aimin Pan
22
4
0
06 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
27
22
0
01 Dec 2023
vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training
Jehyeon Bang
Yujeong Choi
Myeongwoo Kim
Yongdeok Kim
Minsoo Rhu
27
15
0
27 Nov 2023
Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search
Zhiqi Lin
Youshan Miao
Guanbin Xu
Cheng Li
Olli Saarikivi
Saeed Maleki
Fan Yang
12
6
0
26 Nov 2023
HongTu: Scalable Full-Graph GNN Training on Multiple GPUs (via communication-optimized CPU data offloading)
Qiange Wang
Yao Chen
Weng-Fai Wong
Bingsheng He
GNN
20
9
0
25 Nov 2023
NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments
Xin Ai
Qiange Wang
Chunyu Cao
Yanfeng Zhang
Chaoyi Chen
Hao Yuan
Yu Gu
Ge Yu
GNN
41
5
0
22 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
61
15
0
20 Nov 2023
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
Han Guo
P. Greengard
Eric P. Xing
Yoon Kim
MQ
36
43
0
20 Nov 2023
Zero redundancy distributed learning with differential privacy
Zhiqi Bu
Justin Chiu
Ruixuan Liu
Sheng Zha
George Karypis
45
8
0
20 Nov 2023
Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
M. Ibrahim
Shaizeen Aga
Ada Li
Suchita Pati
Mahzabeen Islam
21
3
0
08 Nov 2023
Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models
Longteng Zhang
Xiang Liu
Zeyu Li
Xinglin Pan
Peijie Dong
...
Rui Guo
Xin Wang
Qiong Luo
S. Shi
Xiaowen Chu
41
7
0
07 Nov 2023
Remember what you did so you know what to do next
Manuel R. Ciosici
Alex Hedges
Yash Kankanampati
Justin Martin
Marjorie Freedman
R. Weischedel
LM&Ro
20
0
0
30 Oct 2023
ROAM: memory-efficient large DNN training via optimized operator ordering and memory layout
Huiyao Shu
Ang Wang
Ziji Shi
Hanyu Zhao
Yong Li
Lu Lu
OffRL
26
2
0
30 Oct 2023
CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment
Jixiang Hong
Quan Tu
C. Chen
Xing Gao
Ji Zhang
Rui Yan
ALM
14
11
0
25 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Ruoyu Sun
Zhimin Luo
19
47
0
16 Oct 2023
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu
Lei Xia
Qingping Li
Kangyu Li
Xu Chen
Yongqiang Guo
Tieyao Xiang
Yuheng Chen
Shigang Li
27
11
0
16 Oct 2023
G10: Enabling An Efficient Unified GPU Memory and Storage Architecture with Smart Tensor Migrations
Haoyang Zhang
Yirui Eric Zhou
Yu Xue
Yiqi Liu
Jian Huang
14
16
0
13 Oct 2023
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model
Qichen Ye
Junling Liu
Dading Chong
Peilin Zhou
Yining Hua
...
Meng Cao
Ziming Wang
Xuxin Cheng
Andrew Liu
Zhenhua Guo
AI4MH
LM&MA
ELM
30
20
0
13 Oct 2023
Rethinking Memory and Communication Cost for Efficient Large Language Model Training
Chan Wu
Hanxiao Zhang
Lin Ju
Jinjing Huang
Youshao Xiao
...
Siyuan Li
Fanzhuang Meng
Lei Liang
Xiaolu Zhang
Jun Zhou
18
4
0
09 Oct 2023
Generative Judge for Evaluating Alignment
Junlong Li
Shichao Sun
Weizhe Yuan
Run-Ze Fan
Hai Zhao
Pengfei Liu
ELM
ALM
35
76
0
09 Oct 2023
Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly
Herbert Woisetschläger
Alexander Erben
Shiqiang Wang
R. Mayer
Hans-Arno Jacobsen
FedML
34
17
0
04 Oct 2023