Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
arXiv:2411.06770 · 11 November 2024
Zhijie Chen, Qiaobo Li, A. Banerjee · FedML

Papers citing "Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis"
50 / 67 papers shown
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov, Peter Richtárik · 22 Dec 2024

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhimin Luo · 26 Feb 2024

Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants
Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko · 16 Feb 2024

Correlation Aware Sparsified Mean Estimation Using Random Projection
Shuli Jiang, Pranay Sharma, Gauri Joshi · 29 Oct 2023

Momentum Provably Improves Error Feedback!
Ilyas Fatkhullin, Alexander Tyurin, Peter Richtárik · 24 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu, Zhiyuan Li, David Leo Wright Hall, Percy Liang, Tengyu Ma · VLM · 23 May 2023

Revisiting Gradient Clipping: Stochastic Bias and Tight Convergence Guarantees
Anastasia Koloskova, Hadrien Hendrikx, Sebastian U. Stich · 02 May 2023

$z$-SignFedAvg: A Unified Stochastic Sign-based Compression for Federated Learning
Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang · FedML · 06 Feb 2023

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance
Abdurakhmon Sadiev, Marina Danilova, Eduard A. Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik · 02 Feb 2023
Communication-Efficient Federated Learning for Heterogeneous Edge Devices Based on Adaptive Gradient Quantization
Heting Liu, Fang He, Guohong Cao · FedML, MQ · 16 Dec 2022

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
Zhao Song, Yitan Wang, Zheng Yu, Licheng Zhang · FedML · 15 Oct 2022

Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
Haibo Yang, Pei-Yuan Qiu, Jia Liu · FedML · 03 Oct 2022

Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo · 28 May 2022
On Distributed Adaptive Optimization with Gradient Compression
Xiaoyun Li, Belhal Karimi, Ping Li · 11 May 2022

A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks
Mingrui Liu, Zhenxun Zhuang, Yunwei Lei, Chunyang Liao · 10 May 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
Konstantin Mishchenko, Grigory Malinovsky, Sebastian U. Stich, Peter Richtárik · 18 Feb 2022

3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin, Elnur Gasanov, Zhize Li, Eduard A. Gorbunov · 02 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li · ODL · 31 Jan 2022

Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization
Yujia Wang, Lu Lin, Jinghui Chen · 01 Nov 2021

Permutation Compressors for Provably Faster Distributed Nonconvex Optimization
Rafal Szlendak, Alexander Tyurin, Peter Richtárik · 07 Oct 2021

High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails
Ashok Cutkosky, Harsh Mehta · 28 Jun 2021
On Large-Cohort Training for Federated Learning
Zachary B. Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith · FedML · 15 Jun 2021

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback
Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin · 09 Jun 2021

FedNL: Making Newton-Type Methods Applicable to Federated Learning
M. Safaryan, Rustem Islamov, Xun Qian, Peter Richtárik · FedML · 05 Jun 2021

DRIVE: One-bit Distributed Mean Estimation
S. Vargaftik, Ran Ben-Basat, Amit Portnoy, Gal Mendelson, Y. Ben-Itzhak, Michael Mitzenmacher · OOD, FedML · 18 May 2021
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
Conglong Li, A. A. Awan, Hanlin Tang, Samyam Rajbhandari, Yuxiong He · 13 Apr 2021

Hessian Eigenspectra of More Realistic Nonlinear Models
Zhenyu Liao, Michael W. Mahoney · 02 Mar 2021

Learning Neural Network Subspaces
Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari · UQCV · 20 Feb 2021

MARINA: Faster Non-Convex Distributed Learning with Compression
Eduard A. Gorbunov, Konstantin Burlachenko, Zhize Li, Peter Richtárik · 15 Feb 2021
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, A. A. Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He · AI4CE · 04 Feb 2021

Improving Neural Network Training in Low Dimensional Random Bases
Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi · 09 Nov 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby · ViT · 22 Oct 2020
A High Probability Analysis of Adaptive SGD with Momentum
Xiaoyun Li, Francesco Orabona · 28 Jul 2020

CSER: Communication-efficient SGD with Error Reset
Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Yanghua Peng · 26 Jul 2020

FetchSGD: Communication-Efficient Federated Learning with Sketching
D. Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Ion Stoica, Vladimir Braverman, Joseph E. Gonzalez, Raman Arora · FedML · 15 Jul 2020

Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, M. Mahdavi · FedML · 02 Jul 2020
Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov · 21 May 2020

Quantized Adam with Error Feedback
Congliang Chen, Li Shen, Haozhi Huang, Wei Liu · ODL, MQ · 29 Apr 2020

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
Wenlong Mou, C. J. Li, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan · 09 Apr 2020
Adaptive Federated Optimization
Sashank J. Reddi, Zachary B. Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konecný, Sanjiv Kumar, H. B. McMahan · FedML · 29 Feb 2020

On Biased Compression for Distributed Learning
Aleksandr Beznosikov, Samuel Horváth, Peter Richtárik, M. Safaryan · 27 Feb 2020

PyHessian: Neural Networks Through the Lens of the Hessian
Z. Yao, A. Gholami, Kurt Keutzer, Michael W. Mahoney · ODL · 16 Dec 2019
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization
Amirhossein Reisizadeh, Aryan Mokhtari, Hamed Hassani, Ali Jadbabaie, Ramtin Pedarsani · FedML · 28 Sep 2019

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee · ODL · 24 Jul 2019

Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu, Deepesh Data, C. Karakuş, Suhas Diggavi · MQ · 06 Jun 2019

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback
Shuai Zheng, Ziyue Huang, James T. Kwok · 27 May 2019
Sub-Weibull distributions: generalizing sub-Gaussian and sub-Exponential properties to heavier-tailed distributions
M. Vladimirova, Stéphane Girard, Hien Nguyen, Julyan Arbel · 13 May 2019

On the Convergence of Adam and Beyond
Sashank J. Reddi, Satyen Kale, Sanjiv Kumar · 19 Apr 2019

Communication-efficient distributed SGD with Sketching
Nikita Ivkin, D. Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica, R. Arora · FedML · 12 Mar 2019

Compressing Gradient Optimizers via Count-Sketches
Ryan Spring, Anastasios Kyrillidis, Vijai Mohan, Anshumali Shrivastava · 01 Feb 2019