At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?

26 September 2019
Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry

Papers citing "At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?"

10 / 10 papers shown

Ordered Momentum for Asynchronous SGD
Chang-Wei Shi, Yi-Rui Yang, Wu-Jun Li
ODL · 27 Jul 2024

DropCompute: simple and more robust distributed synchronous training via compute variance reduction
Niv Giladi, Shahar Gottlieb, Moran Shkolnik, A. Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry
18 Jun 2023

TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with Adaptive Partial Training
Tuo Zhang, Lei Gao, Sunwoo Lee, Mi Zhang, Salman Avestimehr
FedML · 14 Apr 2023

DoCoFL: Downlink Compression for Cross-Device Federated Learning
Ron Dorfman, S. Vargaftik, Y. Ben-Itzhak, Kfir Y. Levy
FedML · 01 Feb 2023

Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington
10 Oct 2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying
24 Apr 2022

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

DBS: Dynamic Batch Size For Distributed Deep Neural Network Training
Qing Ye, Yuhao Zhou, Mingjia Shi, Yanan Sun, Jiancheng Lv
23 Jul 2020

Pipelined Backpropagation at Scale: Training Large Models without Batches
Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Koster
25 Mar 2020

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL · 15 Sep 2016