2111.00705
Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization
1 November 2021
Yujia Wang
Lu Lin
Jinghui Chen
Papers citing
"Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization"
15 / 15 papers shown
Title
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
80
0
0
11 Nov 2024
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
Xiaochen Wang
Jiaqi Wang
Houping Xiao
Jianfei Chen
Fenglong Ma
MedIm
109
7
0
17 Aug 2024
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang
Shaoduo Gan
A. A. Awan
Samyam Rajbhandari
Conglong Li
Xiangru Lian
Ji Liu
Ce Zhang
Yuxiong He
AI4CE
62
87
0
04 Feb 2021
Linearly Converging Error Compensated SGD
Eduard A. Gorbunov
D. Kovalev
Dmitry Makarenko
Peter Richtárik
193
79
0
23 Oct 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
46
21
0
24 Jun 2020
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu
Yura Malitsky
P. Mertikopoulos
Volkan Cevher
ODL
53
41
0
21 Mar 2020
Decentralized Deep Learning with Arbitrary Communication Compression
Anastasia Koloskova
Tao R. Lin
Sebastian U. Stich
Martin Jaggi
FedML
39
235
0
22 Jul 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu
Deepesh Data
C. Karakuş
Suhas Diggavi
MQ
54
405
0
06 Jun 2019
DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression
Hanlin Tang
Xiangru Lian
Chen Yu
Tong Zhang
Ji Liu
40
219
0
15 May 2019
Adaptive Gradient Methods with Dynamic Bound of Learning Rate
Liangchen Luo
Yuanhao Xiong
Yan Liu
Xu Sun
ODL
74
602
0
26 Feb 2019
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou
Yiqi Tang
Yuan Cao
Ziyan Yang
Quanquan Gu
52
151
0
16 Aug 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
72
192
0
18 Jun 2018
Asynchronous Stochastic Gradient Descent with Delay Compensation
Shuxin Zheng
Qi Meng
Taifeng Wang
Wei Chen
Nenghai Yu
Zhiming Ma
Tie-Yan Liu
98
314
0
27 Sep 2016
Wide Residual Networks
Sergey Zagoruyko
N. Komodakis
330
7,980
0
23 May 2016
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
134
6,626
0
22 Dec 2012