Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
arXiv:2411.06770 · 11 November 2024
Zhijie Chen, Qiaobo Li, A. Banerjee · FedML

Papers citing "Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis"
50 / 67 papers shown
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov, Peter Richtárik · 22 Dec 2024

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhimin Luo · 26 Feb 2024

Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants
Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko · 16 Feb 2024

Correlation Aware Sparsified Mean Estimation Using Random Projection
Shuli Jiang, Pranay Sharma, Gauri Joshi · 29 Oct 2023

Momentum Provably Improves Error Feedback!
Ilyas Fatkhullin, Alexander Tyurin, Peter Richtárik · 24 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu, Zhiyuan Li, David Leo Wright Hall, Percy Liang, Tengyu Ma · VLM · 23 May 2023

Revisiting Gradient Clipping: Stochastic Bias and Tight Convergence Guarantees
Anastasia Koloskova, Hadrien Hendrikx, Sebastian U. Stich · 02 May 2023

$z$-SignFedAvg: A Unified Stochastic Sign-based Compression for Federated Learning
Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang · FedML · 06 Feb 2023

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance
Abdurakhmon Sadiev, Marina Danilova, Eduard A. Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik · 02 Feb 2023
Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization
Heting Liu, Fang He, Guohong Cao · FedML, MQ · 16 Dec 2022

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
Zhao Song, Yitan Wang, Zheng Yu, Licheng Zhang · FedML · 15 Oct 2022

Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
Haibo Yang, Pei-Yuan Qiu, Jia Liu · FedML · 03 Oct 2022

Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo · 28 May 2022
On Distributed Adaptive Optimization with Gradient Compression
Xiaoyun Li, Belhal Karimi, Ping Li · 11 May 2022

A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks
Mingrui Liu, Zhenxun Zhuang, Yunwei Lei, Chunyang Liao · 10 May 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
Konstantin Mishchenko, Grigory Malinovsky, Sebastian U. Stich, Peter Richtárik · 18 Feb 2022

3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin, Elnur Gasanov, Zhize Li, Eduard A. Gorbunov · 02 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li · ODL · 31 Jan 2022

Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization
Yujia Wang, Lu Lin, Jinghui Chen · 01 Nov 2021

Permutation Compressors for Provably Faster Distributed Nonconvex Optimization
Rafal Szlendak, Alexander Tyurin, Peter Richtárik · 07 Oct 2021

High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Ashok Cutkosky, Harsh Mehta · 28 Jun 2021
On Large-Cohort Training for Federated Learning
Zachary B. Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith · FedML · 15 Jun 2021

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin · 09 Jun 2021

FedNL: Making Newton-Type Methods Applicable to Federated Learning
M. Safaryan, Rustem Islamov, Xun Qian, Peter Richtárik · FedML · 05 Jun 2021

DRIVE: One-bit Distributed Mean Estimation
S. Vargaftik, Ran Ben-Basat, Amit Portnoy, Gal Mendelson, Y. Ben-Itzhak, Michael Mitzenmacher · OOD, FedML · 18 May 2021
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
Conglong Li, A. A. Awan, Hanlin Tang, Samyam Rajbhandari, Yuxiong He · 13 Apr 2021

Hessian Eigenspectra of More Realistic Nonlinear Models
Zhenyu Liao, Michael W. Mahoney · 02 Mar 2021

Learning Neural Network Subspaces
Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari · UQCV · 20 Feb 2021

MARINA: Faster Non-Convex Distributed Learning with Compression
Eduard A. Gorbunov, Konstantin Burlachenko, Zhize Li, Peter Richtárik · 15 Feb 2021
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, A. A. Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He · AI4CE · 04 Feb 2021

Improving Neural Network Training in Low Dimensional Random Bases
Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi · 09 Nov 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby · ViT · 22 Oct 2020
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li, Francesco Orabona · 28 Jul 2020

CSER: Communication-efficient SGD with Error Reset
Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Yanghua Peng · 26 Jul 2020

FetchSGD: Communication-Efficient Federated Learning with Sketching
D. Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Ion Stoica, Vladimir Braverman, Joseph E. Gonzalez, Raman Arora · FedML · 15 Jul 2020

Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, M. Mahdavi · FedML · 02 Jul 2020
Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov · 21 May 2020

Quantized Adam with Error Feedback
Congliang Chen, Li Shen, Haozhi Huang, Wei Liu · ODL, MQ · 29 Apr 2020

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
Wenlong Mou, C. J. Li, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan · 09 Apr 2020
Adaptive Federated Optimization
Sashank J. Reddi, Zachary B. Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konecný, Sanjiv Kumar, H. B. McMahan · FedML · 29 Feb 2020

On Biased Compression for Distributed Learning
Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik, M. Safaryan · 27 Feb 2020

PyHessian: Neural Networks Through the Lens of the Hessian
Z. Yao, A. Gholami, Kurt Keutzer, Michael W. Mahoney · ODL · 16 Dec 2019
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization
Amirhossein Reisizadeh, Aryan Mokhtari, Hamed Hassani, Ali Jadbabaie, Ramtin Pedarsani · FedML · 28 Sep 2019

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee · ODL · 24 Jul 2019

Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu, Deepesh Data, C. Karakuş, Suhas Diggavi · MQ · 06 Jun 2019

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback
Shuai Zheng, Ziyue Huang, James T. Kwok · 27 May 2019
Sub-Weibull distributions: generalizing sub-Gaussian and sub-Exponential properties to heavier-tailed distributions
M. Vladimirova, Stéphane Girard, Hien Nguyen, Julyan Arbel · 13 May 2019

On the Convergence of Adam and Beyond
Sashank J. Reddi, Satyen Kale, Sanjiv Kumar · 19 Apr 2019

Communication-efficient distributed SGD with Sketching
Nikita Ivkin, D. Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica, R. Arora · FedML · 12 Mar 2019

Compressing Gradient Optimizers via Count-Sketches
Ryan Spring, Anastasios Kyrillidis, Vijai Mohan, Anshumali Shrivastava · 01 Feb 2019