Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization

1 November 2021
Yujia Wang, Lu Lin, Jinghui Chen

Papers citing "Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization"

15 / 15 papers shown
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen, Qiaobo Li, A. Banerjee · FedML · 11 Nov 2024
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
Xiaochen Wang, Jiaqi Wang, Houping Xiao, Jianfei Chen, Fenglong Ma · MedIm · 17 Aug 2024
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, A. A. Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He · AI4CE · 04 Feb 2021
Linearly Converging Error Compensated SGD
Eduard A. Gorbunov, D. Kovalev, Dmitry Makarenko, Peter Richtárik · 23 Oct 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng, Yanghua Peng, Sheng Zha, Mu Li · ODL · 24 Jun 2020
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu, Yura Malitsky, P. Mertikopoulos, Volkan Cevher · ODL · 21 Mar 2020
Decentralized Deep Learning with Arbitrary Communication Compression
Anastasia Koloskova, Tao R. Lin, Sebastian U. Stich, Martin Jaggi · FedML · 22 Jul 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu, Deepesh Data, C. Karakuş, Suhas Diggavi · MQ · 06 Jun 2019
DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression
Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu · 15 May 2019
Adaptive Gradient Methods with Dynamic Bound of Learning Rate
Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun · ODL · 26 Feb 2019
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou, Yiqi Tang, Yuan Cao, Ziyan Yang, Quanquan Gu · 16 Aug 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu · ODL · 18 Jun 2018
Asynchronous Stochastic Gradient Descent with Delay Compensation
Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhiming Ma, Tie-Yan Liu · 27 Sep 2016
Wide Residual Networks
Sergey Zagoruyko, N. Komodakis · 23 May 2016
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler · ODL · 22 Dec 2012