2411.06770
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
11 November 2024
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
Papers citing
"Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis"
17 / 67 papers shown
Title
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani
Shankar Krishnan
Ying Xiao
ODL
60
323
0
29 Jan 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli
Levent Sagun
Mert Gurbuzbalaban
82
247
0
18 Jan 2019
Tight Analyses for Non-Smooth Stochastic Gradient Descent
Nicholas J. A. Harvey
Christopher Liaw
Y. Plan
Sikander Randhawa
40
138
0
13 Dec 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang
Gauri Joshi
FedML
65
232
0
19 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,729
0
11 Oct 2018
The Convergence of Sparsified Gradient Methods
Dan Alistarh
Torsten Hoefler
M. Johansson
Sarit Khirirat
Nikola Konstantinov
Cédric Renggli
163
493
0
27 Sep 2018
Sparsified SGD with Memory
Sebastian U. Stich
Jean-Baptiste Cordonnier
Martin Jaggi
71
749
0
20 Sep 2018
Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
Jiaxiang Wu
Weidong Huang
Junzhou Huang
Tong Zhang
71
236
0
21 Jun 2018
Local SGD Converges Fast and Communicates Little
Sebastian U. Stich
FedML
164
1,061
0
24 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.0K
7,152
0
20 Apr 2018
Group Normalization
Yuxin Wu
Kaiming He
210
3,652
0
22 Mar 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
ODL
87
1,042
0
13 Feb 2018
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Yujun Lin
Song Han
Huizi Mao
Yu Wang
W. Dally
125
1,407
0
05 Dec 2017
Gradient Sparsification for Communication-Efficient Distributed Optimization
Jianqiao Wangni
Jialei Wang
Ji Liu
Tong Zhang
74
525
0
26 Oct 2017
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
81
236
0
22 Nov 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization
Alexander Rakhlin
Ohad Shamir
Karthik Sridharan
159
768
0
26 Sep 2011