Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise (arXiv:2204.02593)

6 April 2022
D. Jakovetić, Dragana Bajović, Anit Kumar Sahu, S. Kar, Nemanja Milošević, Dusan Stamenkovic
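The common thread running through this paper and the works citing it is stochastic gradient descent in which the noisy gradient is passed through a nonlinear map, such as clipping, sign, or normalization, to tame heavy-tailed noise. As a rough illustration only: the sketch below is a generic nonlinear-SGD loop under that framing, not the paper's exact algorithm, and the function names, step size, and clipping threshold are placeholder assumptions.

```python
import numpy as np

def clip_map(g, tau=1.0):
    # Norm clipping: rescale g so that ||g|| <= tau (tau is a placeholder threshold).
    n = np.linalg.norm(g)
    return g if n <= tau else (tau / n) * g

def sign_map(g):
    # signSGD-style nonlinearity: keep only the coordinate-wise sign.
    return np.sign(g)

def nonlinear_sgd(grad_fn, x0, nonlinearity=clip_map, steps=1000, lr=0.01):
    # Generic nonlinear SGD: x_{t+1} = x_t - lr * Phi(g_t),
    # where g_t is a (possibly heavy-tailed) stochastic gradient.
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad_fn(x)
        x = x - lr * nonlinearity(g)
    return x

# Toy check: quadratic loss with Student-t gradient noise (df = 1.5, which has
# infinite variance, a simple stand-in for heavy tails).
rng = np.random.default_rng(0)
grad = lambda x: 2.0 * x + rng.standard_t(df=1.5, size=x.shape)
x_final = nonlinear_sgd(grad, x0=np.ones(5), nonlinearity=clip_map)
```

Swapping `nonlinearity=sign_map` into the same loop recovers a signSGD-style update, which is the sense in which clipping and sign methods fall under one umbrella here.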

Papers citing "Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise"

18 / 18 papers shown
1. Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
   Aleksandar Armacki, Shuhua Yu, Pranay Sharma, Gauri Joshi, Dragana Bajović, D. Jakovetić, S. Kar
   17 Oct 2024

2. The Heavy-Tail Phenomenon in SGD
   Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu
   08 Jun 2020

3. Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
   Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov
   21 May 2020

4. The Geometry of Sign Gradient Descent
   Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
   19 Feb 2020 (ODL)

5. On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
   Umut Simsekli, Mert Gurbuzbalaban, T. H. Nguyen, G. Richard, Levent Sagun
   29 Nov 2019

6. AdaCliP: Adaptive Clipping for Private SGD
   Venkatadheeraj Pichapati, A. Suresh, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar
   20 Aug 2019

7. Algorithms of Robust Stochastic Optimization Based on Mirror Descent Method
   A. Juditsky, A. Nazin, A. S. Nemirovsky, Alexandre B. Tsybakov
   05 Jul 2019

8. Why gradient clipping accelerates training: A theoretical justification for adaptivity
   J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie
   28 May 2019

9. A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent
   Eduard A. Gorbunov, Filip Hanzely, Peter Richtárik
   27 May 2019

10. On the Adaptivity of Stochastic Gradient-Based Optimization
    Lihua Lei, Michael I. Jordan
    09 Apr 2019 (ODL)

11. signSGD: Compressed Optimisation for Non-Convex Problems
    Jeremy Bernstein, Yu Wang, Kamyar Azizzadenesheli, Anima Anandkumar
    13 Feb 2018 (FedML, ODL)

12. Optimization Methods for Large-Scale Machine Learning
    Léon Bottou, Frank E. Curtis, J. Nocedal
    15 Jun 2016

13. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization
    Horia Mania, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan
    24 Jul 2015

14. Convex Optimization for Big Data
    Volkan Cevher, Stephen Becker, Mark Schmidt
    04 Nov 2014

15. Robust Consensus in the Presence of Impulsive Channel Noise
    Sivaraman Dasarathan, C. Tepedelenlioğlu, M. Banavar, A. Spanias
    16 Aug 2014

16. On the difficulty of training Recurrent Neural Networks
    Razvan Pascanu, Tomas Mikolov, Yoshua Bengio
    21 Nov 2012 (ODL)

17. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
    Feng Niu, Benjamin Recht, Christopher Ré, Stephen J. Wright
    28 Jun 2011

18. Distributed Parameter Estimation in Sensor Networks: Nonlinear Observation Models and Imperfect Communication
    S. Kar, José M. F. Moura, K. Ramanan
    29 Aug 2008