ResearchTrend.AI
Does compressing activations help model parallel training?

6 January 2023
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman

Papers citing "Does compressing activations help model parallel training?"

4 / 4 papers shown
Activations and Gradients Compression for Model-Parallel Training
Mikhail Rudakov
Aleksandr Beznosikov
Yaroslav Kholodov
Alexander Gasnikov
39
1
0
15 Jan 2024
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Egor Shulgin
Peter Richtárik
AI4CE
34
6
0
28 Jun 2023
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,833
0
17 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018