Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

5 June 2023
Tongtian Zhu, Fengxiang He, Kaixuan Chen, Mingli Song, Dacheng Tao
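
For context, the equivalence named in the title relates two standard objects. A minimal statement, assuming the usual formulations (the notation below is generic and not copied from the paper): decentralized SGD (D-SGD) on n nodes with a doubly stochastic mixing matrix W updates each local model as

    x_i^{(t+1)} = \sum_{j=1}^{n} W_{ij} x_j^{(t)} - \eta \, \nabla f_i\big(x_i^{(t)}; \xi_i^{(t)}\big),

while sharpness-aware minimization (SAM) minimizes the worst-case perturbed objective

    \min_x \; \max_{\|\epsilon\|_2 \le \rho} f(x + \epsilon).

The paper's result, as stated in its title, is that D-SGD asymptotically behaves like a SAM variant in which the worst-case perturbation is replaced by an average over perturbation directions (hence "average-direction SAM").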

Papers citing "Decentralized SGD and Average-direction SAM are Asymptotically Equivalent"

12 / 12 papers shown

  • SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data
    Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy · FedML · 22 May 2024 · 0 citations
  • Layer-wise Linear Mode Connectivity
    Linara Adilova, Maksym Andriushchenko, Michael Kamp, Asja Fischer, Martin Jaggi · FedML, FAtt, MoMe · 13 Jul 2023 · 14 citations
  • Decentralized Adversarial Training over Graphs
    Ying Cao, Elsa Rizk, Stefan Vlaski, Ali H. Sayed · AAML · 23 Mar 2023 · 1 citation
  • SWIFT: Rapid Decentralized Federated Learning via Wait-Free Model Communication
    Marco Bornstein, Tahseen Rabbani, Evana Wang, Amrit Singh Bedi, Furong Huang · FedML · 25 Oct 2022 · 18 citations
  • Anticorrelated Noise Injection for Improved Generalization
    Antonio Orvieto, Hans Kersting, F. Proske, Francis R. Bach, Aurelien Lucchi · 06 Feb 2022 · 44 citations
  • BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning
    Bicheng Ying, Kun Yuan, Hanbin Hu, Yiming Chen, W. Yin · FedML · 08 Nov 2021 · 28 citations
  • Exponential escape efficiency of SGD from sharp minima in non-stationary regime
    Hikaru Ibayashi, Masaaki Imaizumi · 07 Nov 2021 · 4 citations
  • Sharpness-Aware Minimization Improves Language Model Generalization
    Dara Bahri, H. Mobahi, Yi Tay · 16 Oct 2021 · 98 citations
  • Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
    Jiawei Du, Hanshu Yan, Jiashi Feng, Qiufeng Wang, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan · AAML · 07 Oct 2021 · 133 citations
  • DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training
    Kun Yuan, Yiming Chen, Xinmeng Huang, Yingya Zhang, Pan Pan, Yinghui Xu, W. Yin · MoE · 24 Apr 2021 · 62 citations
  • On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016 · 2,908 citations
  • Densely Connected Convolutional Networks
    Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger · PINN, 3DV · 25 Aug 2016 · 36,493 citations