Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees

2 June 2022
Jue Wang
Binhang Yuan
Luka Rimanic
Yongjun He
Tri Dao
Beidi Chen
Christopher Ré
Ce Zhang
    AI4CE

Papers citing "Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees"

10 / 10 papers shown
Accelerating AllReduce with a Persistent Straggler
Arjun Devraj
Eric Ding
Abhishek Vijaya Kumar
Robert Kleinberg
Rachee Singh
56
0
0
29 May 2025
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Yaojie Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ
173
11
0
25 Oct 2024
Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
Alan Chan
Ben Bucknall
Herbie Bradley
David M. Krueger
63
6
0
22 Dec 2023
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
Alexander Borzunov
Max Ryabinin
Artem Chumachenko
Dmitry Baranchuk
Tim Dettmers
Younes Belkada
Pavel Samygin
Colin Raffel
MoE, ALM
65
42
0
13 Dec 2023
Fast Distributed Inference Serving for Large Language Models
Bingyang Wu
Yinmin Zhong
Zili Zhang
Gang Huang
Xuanzhe Liu
Xin Jin
88
102
0
10 May 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
Suchin Gururangan
Margaret Li
M. Lewis
Weijia Shi
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoE
102
48
0
24 Mar 2023
Quantized Distributed Training of Large Models with Convergence Guarantees
I. Markov
Adrian Vladu
Qi Guo
Dan Alistarh
MQ
97
11
0
05 Feb 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
74
9
0
06 Jan 2023
RCD-SGD: Resource-Constrained Distributed SGD in Heterogeneous Environment via Submodular Partitioning
Haoze He
Parijat Dube
58
1
0
02 Nov 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
108
24
0
19 Oct 2022