Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2010.12460
Cited By
Adaptive Gradient Quantization for Data-Parallel SGD
23 October 2020
Fartash Faghri
Iman Tabrizian
I. Markov
Dan Alistarh
Daniel M. Roy
Ali Ramezani-Kebrya
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Adaptive Gradient Quantization for Data-Parallel SGD"
17 / 17 papers shown
Title
Differentiable Weightless Neural Networks
Alan T. L. Bacellar
Zachary Susskind
Mauricio Breternitz Jr.
E. John
L. John
P. Lima
F. M. G. França
34
3
0
14 Oct 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
39
6
0
09 Apr 2024
Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates
Ahmad Rammal
Kaja Gruntkowska
Nikita Fedin
Eduard A. Gorbunov
Peter Richtárik
50
5
0
15 Oct 2023
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
Ali Ramezani-Kebrya
Kimon Antonakopoulos
Igor Krawczuk
Justin Deschenaux
V. Cevher
41
3
0
17 Aug 2023
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James OÑeill
Sourav Dutta
VLM
MQ
47
1
0
12 Jul 2023
FedREP: A Byzantine-Robust, Communication-Efficient and Privacy-Preserving Framework for Federated Learning
Yi-Rui Yang
Kun Wang
Wulu Li
FedML
52
3
0
09 Mar 2023
Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song
Jinkyu Yim
Jaewon Jung
Hongsun Jang
H. Kim
Youngsok Kim
Jinho Lee
GNN
34
25
0
24 Jan 2023
Adaptive Compression for Communication-Efficient Distributed Training
Maksim Makarenko
Elnur Gasanov
Rustem Islamov
Abdurakhmon Sadiev
Peter Richtárik
55
14
0
31 Oct 2022
L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient and Accurate Deep Learning
Mohammadreza Alimohammadi
I. Markov
Elias Frantar
Dan Alistarh
40
5
0
31 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
39
24
0
19 Oct 2022
Empirical Analysis on Top-k Gradient Sparsification for Distributed Deep Learning in a Supercomputing Environment
Daegun Yoon
Sangyoon Oh
28
0
0
18 Sep 2022
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang
Binhang Yuan
Luka Rimanic
Yongjun He
Tri Dao
Beidi Chen
Christopher Ré
Ce Zhang
AI4CE
31
11
0
02 Jun 2022
Communication-Efficient Distributed Learning via Sparse and Adaptive Stochastic Gradient
Xiaoge Deng
Dongsheng Li
Tao Sun
Xicheng Lu
FedML
28
0
0
08 Dec 2021
What Do We Mean by Generalization in Federated Learning?
Honglin Yuan
Warren Morningstar
Lin Ning
K. Singhal
OOD
FedML
46
71
0
27 Oct 2021
Fundamental limits of over-the-air optimization: Are analog schemes optimal?
Shubham K. Jha
Prathamesh Mayekar
Himanshu Tyagi
29
7
0
11 Sep 2021
NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
Ali Ramezani-Kebrya
Fartash Faghri
Ilya Markov
V. Aksenov
Dan Alistarh
Daniel M. Roy
MQ
65
31
0
28 Apr 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
42
33
0
04 Mar 2021
1