Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
arXiv:2301.09830, 24 January 2023
Authors: Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee
Community: GNN
Papers citing "Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression" (16 of 16 papers shown):
Hiding Communication Cost in Distributed LLM Training via Micro-batch Co-execution. Haiquan Wang, Chaoyi Ruan, Jia He, Jiaqi Ruan, Chengjie Tang, Xiaosong Ma, Cheng-rong Li. 24 Nov 2024.
VcLLM: Video Codecs are Secretly Tensor Codecs. Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills. 29 Jun 2024.
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System. Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee. 11 Mar 2024.
Activations and Gradients Compression for Model-Parallel Training. Mikhail Rudakov, Aleksandr Beznosikov, Yaroslav Kholodov, Alexander Gasnikov. 15 Jan 2024.
Training and Serving System of Foundation Models: A Comprehensive Survey. Jiahang Zhou, Yanyu Chen, Zicong Hong, Wuhui Chen, Yue Yu, Tao Zhang, Hui Wang, Chuan-fu Zhang, Zibin Zheng. 05 Jan 2024. [ALM]
Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production. Chandra Irugalbandara, Ashish Mahendra, Roland Daynauth, T. Arachchige, Jayanaka L. Dantanarayana, K. Flautner, Lingjia Tang, Yiping Kang, Jason Mars. 20 Dec 2023. [ELM]
vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training. Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, Minsoo Rhu. 27 Nov 2023.
Automatic Task Parallelization of Dataflow Graphs in ML/DL models. Srinjoy Das, Lawrence Rauchwerger. 22 Aug 2023.
Pipe-BD: Pipelined Parallel Blockwise Distillation. Hongsun Jang, Jaewon Jung, Jaeyong Song, Joonsang Yu, Youngsok Kim, Jinho Lee. 29 Jan 2023. [MoE, AI4CE]
Does compressing activations help model parallel training? S. Bian, Dacheng Li, Hongyi Wang, Eric P. Xing, Shivaram Venkataraman. 06 Jan 2023.
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. Shigang Li, Torsten Hoefler. 14 Jul 2021. [GNN, AI4CE, LRM]
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, ..., Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei Zhang, K. Gopalakrishnan. 21 Apr 2021.
An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems. A. Abdelmoniem, Ahmed Elzanaty, Mohamed-Slim Alouini, Marco Canini. 26 Jan 2021.
ZeRO-Offload: Democratizing Billion-Scale Model Training. Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyang Yang, Minjia Zhang, Dong Li, Yuxiong He. 18 Jan 2021. [MoE]
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro. 17 Sep 2019. [MoE]
Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, ..., J. Zhao, Haotong Zhang, Alex Smola, Jinyang Li, Zheng-Wei Zhang. 03 Sep 2019. [AI4CE, GNN]